EP-4740464-A1 - NONLINEAR INVERSE TRANSFORMS FOR VIDEO COMPRESSION


Abstract

Decoding using rate-distortion optimization including inverse transformation including transform size adaptive directional nonlinear filtering includes generating decoded block data by decoding a current block of a current frame. Decoding the current block includes obtaining quantized transform coefficients for the current block from an encoded bitstream, obtaining transform data for the current block from the encoded bitstream, wherein the transform data indicates a transform type and a transform size, obtaining dequantized transform block data by dequantizing the quantized transform coefficients, and obtaining decoded residual block data by inverse transforming the dequantized transform block data in accordance with the transform data. Inverse transforming the dequantized transform block data includes obtaining intermediate decoded block data by combining prediction block data for the current block and the decoded residual block data and obtaining the decoded block data by filtering the intermediate decoded block data using a transform size adaptive directional nonlinear filter.
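
The decoding flow summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: the source does not define the filter, so `directional_nonlinear_filter` is a hypothetical stand-in (a median of three samples along one direction, blended with a strength that grows with the transform size), only a DCT transform type is modeled, and every function and parameter name here (`decode_block`, `q_step`, `filter_enabled`, and so on) is invented for the example.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis, one basis vector per row."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse_dct2d(coeffs):
    """Invert Y = C X C^T for an orthonormal C, i.e. X = C^T Y C."""
    c = dct_matrix(len(coeffs))
    c_t = [list(col) for col in zip(*c)]
    return matmul(matmul(c_t, coeffs), c)

def directional_nonlinear_filter(block, tx_size, direction=(0, 1)):
    """Hypothetical stand-in filter (assumption, not from the source).

    Takes the median of three samples along `direction` and blends it
    with the input; the blend strength increases with `tx_size`.
    """
    n = len(block)
    strength = min(1.0, tx_size / 16.0)  # assumed size-to-strength mapping
    dy, dx = direction
    out = [row[:] for row in block]
    for y in range(n):
        for x in range(n):
            ya, xa = min(max(y - dy, 0), n - 1), min(max(x - dx, 0), n - 1)
            yb, xb = min(max(y + dy, 0), n - 1), min(max(x + dx, 0), n - 1)
            med = sorted([block[ya][xa], block[y][x], block[yb][xb]])[1]
            out[y][x] = (1.0 - strength) * block[y][x] + strength * med
    return out

def decode_block(qcoeffs, q_step, prediction, tx_size, filter_enabled=True):
    """Dequantize, inverse transform, add the prediction, then filter."""
    dequantized = [[q * q_step for q in row] for row in qcoeffs]
    residual = inverse_dct2d(dequantized)
    intermediate = [[p + r for p, r in zip(prow, rrow)]
                    for prow, rrow in zip(prediction, residual)]
    if not filter_enabled:
        # Mirrors the case where the bitstream signals the filter as disabled.
        return intermediate
    return directional_nonlinear_filter(intermediate, tx_size)
```

The point of the sketch is the ordering: the filter runs on the sum of prediction and decoded residual, so it is treated as part of the inverse transform step rather than as a separate reconstruction (loop) filter.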

Inventors

GULERYUZ, ONUR
CHEN, JIANLE
MUKHERJEE, DEBARGHA
LU, Lester

Assignees

Google LLC

Dates

Publication Date: 2026-05-13
Application Date: 2024-08-06

Claims (20)

What is claimed is:
1. A method comprising: generating encoded block data by encoding a current block from a current frame from an input video, wherein encoding the current block includes: obtaining residual block data indicating a difference between the current block and prediction block data for the current block; and identifying an optimal transform from available transforms for transforming the residual block data by evaluating two or more of the available transforms, wherein identifying the optimal transform includes identifying an optimal transform type and an optimal transform size, wherein evaluating a respective available transform includes: obtaining candidate transform block data by transforming the residual block data using the respective available transform; obtaining candidate quantized block data by quantizing the candidate transform block data; obtaining candidate dequantized transform block data by dequantizing the candidate quantized block data; and obtaining candidate decoded residual block data by inverse transforming the candidate dequantized transform block data, wherein inverse transforming the candidate dequantized transform block data includes: obtaining intermediate decoded block data by combining the prediction block data and the candidate decoded residual block data; and obtaining candidate decoded block data by filtering the intermediate decoded block data using an optimal transform size adaptive directional nonlinear filter; including the candidate quantized block data in an encoded bitstream; and outputting the encoded bitstream.
2. The method of claim 1, wherein evaluating the two or more of the available transforms includes: obtaining rate-distortion optimization costs for the available transforms on a per-available transform basis; and identifying the respective available transform as the optimal transform for coding the current block in response to a determination that the respective available transform corresponds to a minimal rate-distortion optimization cost among the rate-distortion optimization costs.
3. The method of claim 1, wherein identifying the optimal transform includes: identifying an optimal transform type; and identifying an optimal transform size.
4. The method of claim 1, wherein inverse transforming the candidate dequantized transform block data includes: obtaining the optimal transform size adaptive directional nonlinear filter.
5. The method of claim 1, further comprising: obtaining reconstructed block data by reconstruction filtering the candidate decoded block data; including the reconstructed block data in reconstructed frame data for the current frame; and storing the reconstructed frame data for subsequently encoding another frame from the input video.
6. The method of claim 1, further comprising: obtaining reconstructed block data by reconstruction filtering the intermediate decoded block data; including the reconstructed block data in reconstructed frame data for the current frame; and storing the reconstructed frame data for subsequently encoding another frame from the input video.
7. The method of claim 1, wherein including the candidate quantized block data in an encoded bitstream includes: including, in the encoded bitstream, data indicating that using a transform size adaptive directional nonlinear filter is enabled.
8. A method comprising: generating decoded block data by decoding a current block of a current frame, wherein decoding the current block includes: obtaining quantized transform coefficients for the current block from an encoded bitstream; obtaining transform data for the current block from the encoded bitstream, wherein the transform data indicates a transform type and a transform size; obtaining dequantized transform block data by dequantizing the quantized transform coefficients; and obtaining decoded residual block data by inverse transforming the dequantized transform block data in accordance with the transform data, wherein inverse transforming the dequantized transform block data includes: obtaining intermediate decoded block data by combining prediction block data for the current block and the decoded residual block data; and obtaining the decoded block data by filtering the intermediate decoded block data using a transform size adaptive directional nonlinear filter; and outputting the decoded block data.
9. The method of claim 8, wherein outputting the decoded block data includes: obtaining reconstructed block data by reconstruction filtering the decoded block data; including the reconstructed block data in reconstructed frame data for the current frame; and outputting the reconstructed frame data.
10. The method of claim 8, wherein obtaining the transform data includes: obtaining transform type data indicating the transform type; and obtaining transform size data indicating the transform size.
11. The method of claim 8, wherein inverse transforming the dequantized transform block data includes: obtaining the transform size adaptive directional nonlinear filter in accordance with the transform data.
12. The method of claim 8, wherein inverse transforming the dequantized transform block data includes accessing, from the encoded bitstream, data indicating that using the transform size adaptive directional nonlinear filter is enabled.
13. The method of claim 8, wherein: in response to accessing, from the encoded bitstream, data indicating that using the transform size adaptive directional nonlinear filter is disabled: inverse transforming the dequantized transform block data omits filtering the intermediate decoded block data using a transform size adaptive directional nonlinear filter; and outputting the decoded block data includes: obtaining reconstructed block data by reconstruction filtering the intermediate decoded block data; including the reconstructed block data in reconstructed frame data for the current frame; and outputting the reconstructed frame data.
14. An apparatus comprising: a non-transitory computer readable medium; and a processor configured to execute instructions stored on the non-transitory computer readable medium to: generate encoded block data, wherein, to generate the encoded block data, the processor is configured to execute the instructions to encode a current block from a current frame from an input video, wherein, to encode the current block, the processor is configured to execute the instructions to: obtain residual block data that indicates a difference between the current block and prediction block data for the current block; and identify an optimal transform from available transforms for transforming the residual block data, wherein to identify the optimal transform, the processor is configured to execute the instructions to evaluate two or more of the available transforms, wherein the optimal transform includes an optimal transform type and an optimal transform size, wherein, to evaluate a respective available transform, the processor is configured to execute the instructions to: obtain candidate transform block data, wherein, to obtain the candidate transform block data, the processor is configured to execute the instructions to use the respective available transform to transform the residual block data; obtain candidate quantized block data, wherein, to obtain the candidate quantized block data, the processor is configured to execute the instructions to quantize the candidate transform block data; obtain candidate dequantized transform block data, wherein, to obtain the candidate dequantized transform block data, the processor is configured to execute the instructions to dequantize the candidate quantized block data; and obtain candidate decoded residual block data, wherein, to obtain the candidate decoded residual block data, the processor is configured to execute the instructions to inverse transform the candidate dequantized transform block data, wherein to inverse transform the candidate dequantized transform block data, the processor is configured to execute the instructions to: obtain intermediate decoded block data, wherein, to obtain the intermediate decoded block data, the processor is configured to execute the instructions to combine the prediction block data and the candidate decoded residual block data; and obtain candidate decoded block data, wherein, to obtain the candidate decoded block data, the processor is configured to execute the instructions to use an optimal transform size adaptive directional nonlinear filter to filter the intermediate decoded block data; include the candidate quantized block data in an encoded bitstream; and output the encoded bitstream.
15. The apparatus of claim 14, wherein, to evaluate the two or more of the available transforms, the processor is configured to execute the instructions to: obtain rate-distortion optimization costs for the available transforms on a per-available transform basis; and identify the respective available transform as the optimal transform for coding the current block in response to a determination that the respective available transform corresponds to a minimal rate-distortion optimization cost among the rate-distortion optimization costs.
16. The apparatus of claim 14, wherein, to identify the optimal transform, the processor is configured to execute the instructions to: identify an optimal transform type; and identify an optimal transform size.
17. The apparatus of claim 14, wherein, to inverse transform the candidate dequantized transform block data, the processor is configured to execute the instructions to: obtain the optimal transform size adaptive directional nonlinear filter.
18. The apparatus of claim 14, wherein the processor is configured to execute the instructions to obtain reconstructed block data, wherein the processor is configured to execute the instructions to: reconstruction filter the candidate decoded block data to obtain the reconstructed block data; include the reconstructed block data in reconstructed frame data for the current frame; and store the reconstructed frame data for subsequently encoding another frame from the input video.
19. The apparatus of claim 14, wherein the processor is configured to execute the instructions to obtain reconstructed block data, wherein the processor is configured to execute the instructions to: reconstruction filter the intermediate decoded block data to obtain the reconstructed block data; include the reconstructed block data in reconstructed frame data for the current frame; and store the reconstructed frame data for subsequently encoding another frame from the input video.
20. The apparatus of claim 14, wherein the processor is configured to execute the instructions to: include, in the encoded bitstream, data indicating that using a transform size adaptive directional nonlinear filter is enabled.
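
The encoder-side search recited in claims 1, 2, 14, and 15 can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the claimed implementation: the cost model (sum of squared errors plus `lam` times a count of nonzero coefficients), the two transform types (`"dct"` and `"identity"`), the stand-in `size_adaptive_filter`, and all names and default values are hypothetical choices made for the example. What it is meant to show is that each candidate's distortion is measured after the size-adaptive filter has been applied, so the filter influences which transform type and size are selected.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis, one basis vector per row."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transform(tile, tx_type, inverse=False):
    """Apply a 2D transform of the given (hypothetical) type to a square tile."""
    if tx_type == "identity":
        return [row[:] for row in tile]
    c = dct_matrix(len(tile))
    c_t = [list(col) for col in zip(*c)]
    if inverse:
        return matmul(matmul(c_t, tile), c)   # X = C^T Y C
    return matmul(matmul(c, tile), c_t)       # Y = C X C^T

def size_adaptive_filter(block, tx_size):
    """Hypothetical stand-in: horizontal median of three, strength set by tx_size."""
    n = len(block)
    strength = min(1.0, tx_size / 16.0)  # assumed mapping, not from the source
    out = [row[:] for row in block]
    for y in range(n):
        for x in range(n):
            left, right = block[y][max(x - 1, 0)], block[y][min(x + 1, n - 1)]
            med = sorted([left, block[y][x], right])[1]
            out[y][x] = (1.0 - strength) * block[y][x] + strength * med
    return out

def evaluate(source, prediction, tx_type, tx_size, q_step, lam):
    """Return (cost, quantized) for one candidate; block size must be a multiple of tx_size."""
    n = len(source)
    residual = [[s - p for s, p in zip(srow, prow)]
                for srow, prow in zip(source, prediction)]
    quantized = [[0] * n for _ in range(n)]
    decoded_residual = [[0.0] * n for _ in range(n)]
    for y0 in range(0, n, tx_size):
        for x0 in range(0, n, tx_size):
            tile = [row[x0:x0 + tx_size] for row in residual[y0:y0 + tx_size]]
            q = [[round(v / q_step) for v in row] for row in transform(tile, tx_type)]
            rec = transform([[v * q_step for v in row] for row in q], tx_type, inverse=True)
            for dy in range(tx_size):
                quantized[y0 + dy][x0:x0 + tx_size] = q[dy]
                decoded_residual[y0 + dy][x0:x0 + tx_size] = rec[dy]
    intermediate = [[p + r for p, r in zip(prow, rrow)]
                    for prow, rrow in zip(prediction, decoded_residual)]
    candidate = size_adaptive_filter(intermediate, tx_size)
    distortion = sum((s - c) ** 2 for srow, crow in zip(source, candidate)
                     for s, c in zip(srow, crow))
    rate = sum(1 for row in quantized for v in row if v != 0)  # crude rate proxy
    return distortion + lam * rate, quantized

def choose_transform(source, prediction, candidates, q_step=4.0, lam=10.0):
    """Pick the (type, size) candidate with the smallest rate-distortion cost."""
    best = None
    for tx_type, tx_size in candidates:
        cost, quantized = evaluate(source, prediction, tx_type, tx_size, q_step, lam)
        if best is None or cost < best[0]:
            best = (cost, tx_type, tx_size, quantized)
    return best
```

A call such as `choose_transform(source, prediction, [("identity", 4), ("dct", 2), ("dct", 4)])` returns the winning cost, transform type, transform size, and quantized coefficients; a real encoder would entropy-code the coefficients and signal the chosen type and size in the bitstream.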

Description

NONLINEAR INVERSE TRANSFORMS FOR VIDEO COMPRESSION

CROSS-REFERENCE TO RELATED APPLICATION(S)

[0001] This application claims priority to and the benefit of U.S. Provisional Patent Application Serial No. 63/531,397, filed August 08, 2023, the entire disclosure of which is hereby incorporated by reference.

BACKGROUND

[0002] Digital images and video can be used, for example, on the internet, for remote business meetings via video conferencing, high-definition video entertainment, video advertisements, or sharing of user-generated content. Due to the large amount of data involved in transferring and processing image and video data, high-performance compression may be advantageous for transmission and storage. Accordingly, it would be advantageous to provide high-resolution image and video transmitted over communications channels having limited bandwidth.

SUMMARY

[0003] This application relates to encoding and decoding of image data, video stream data, or both for transmission, storage, or both. Disclosed herein are aspects of systems, methods, and apparatuses for encoding and decoding using rate-distortion optimization including inverse transformation including transform size adaptive directional nonlinear filtering.

[0004] Variations in these and other aspects will be described in additional detail hereafter.

[0005] An aspect is a method for encoding using rate-distortion optimization including inverse transformation including transform size adaptive directional nonlinear filtering. Encoding using rate-distortion optimization including inverse transformation including transform size adaptive directional nonlinear filtering may include generating encoded block data by encoding a current block from a current frame from an input video.
Encoding the current block may include obtaining residual block data indicating a difference between the current block and prediction block data for the current block and identifying an optimal transform from available transforms for transforming the residual block data by evaluating two or more of the available transforms, wherein identifying the optimal transform includes identifying an optimal transform type and an optimal transform size. Evaluating a respective available transform may include obtaining candidate transform block data by transforming the residual block data using the respective available transform, obtaining candidate quantized block data by quantizing the candidate transform block data, obtaining candidate dequantized transform block data by dequantizing the candidate quantized block data, and obtaining candidate decoded residual block data by inverse transforming the candidate dequantized transform block data. Inverse transforming the candidate dequantized transform block data may include obtaining intermediate decoded block data by combining the prediction block data and the candidate decoded residual block data and obtaining candidate decoded block data by filtering the intermediate decoded block data using an optimal transform size adaptive directional nonlinear filter. Encoding the current block may include including the candidate quantized block data in an encoded bitstream and outputting the encoded bitstream.

[0006] An aspect is a method for decoding using rate-distortion optimization including inverse transformation including transform size adaptive directional nonlinear filtering. Decoding using rate-distortion optimization including inverse transformation including transform size adaptive directional nonlinear filtering may include generating decoded block data by decoding a current block of a current frame and outputting the decoded block data.
Decoding the current block may include obtaining quantized transform coefficients for the current block from an encoded bitstream, obtaining transform data for the current block from the encoded bitstream, wherein the transform data indicates a transform type and a transform size, obtaining dequantized transform block data by dequantizing the quantized transform coefficients, and obtaining decoded residual block data by inverse transforming the dequantized transform block data in accordance with the transform data. Inverse transforming the dequantized transform block data may include obtaining intermediate decoded block data by combining prediction block data for the current block and the decoded residual block data and obtaining the decoded block data by filtering the intermediate decoded block data using a transform size adaptive directional nonlinear filter.

[0007] An aspect is a non-transitory computer-readable storage medium having stored thereon an encoded bitstream comprising quantized transform coefficients for a current block from a current frame of a video and transform data for the current block. The transform data indicates a transform type and a transform size of a transform for inverse transforming dequantized transform block data, the dequantized transform block data corresponding to dequantizi