KR-102961764-B1 - Joint component secondary transform
Abstract
A method for decoding an encoded video bitstream using at least one processor comprises the steps of: acquiring an encoded video bitstream—the encoded video bitstream includes encoded color components—; entropy parsing the encoded color components; dequantizing the color components and acquiring transform coefficients of the color components; applying a joint components secondary transform (JCST) to the transform coefficients of the color components and thereby generating JCST outputs; performing an inverse transform on the JCST outputs and thereby acquiring residual components of the color components; and decoding the encoded video bitstream based on the residual components of the color components.
Inventors
- Zhao, Xin
- Ye, Sehoon
- Liu, Shan
Assignees
- Tencent America LLC
Dates
- Publication Date: 2026-05-07
- Application Date: 2020-11-02
- Priority Date: 2020-10-16
Claims (20)
- A method for decoding an encoded video bitstream using at least one processor, comprising: acquiring an encoded video bitstream, the encoded video bitstream including encoded color components; entropy parsing the encoded color components; dequantizing the color components and obtaining transform coefficients of the color components, the transform coefficients including Cb and Cr transform coefficients located in two respective 4x2 blocks; applying a joint components secondary transform (JCST) element by element to each pair of Cb and Cr transform coefficients located at the same corresponding coordinates in the two 4x2 blocks, thereby generating JCST outputs located at the same corresponding coordinates in two different 4x2 blocks, the JCST being a 2-point transform performed on each pair of Cb and Cr transform coefficients in the two 4x2 blocks to generate a pair of output coefficients that replace that pair; performing an inverse transform on the JCST outputs, thereby obtaining residual components of the color components; and decoding the encoded video bitstream based on the residual components of the color components, wherein a JCST flag signaling whether the JCST is applied is signaled only when the last non-zero coefficient of the color components to which the JCST is applied is located at a position, in scanning order, greater than a threshold.
- The method of claim 1, wherein the transform coefficients include Y, Cb, and Cr transform coefficients.
- The method of claim 1, wherein the JCST is applied for a limited range of block sizes.
- The method of claim 1, further comprising: obtaining, from the encoded video bitstream, a coded video sequence (CVS) including a picture unit corresponding to a coded picture; obtaining a picture header (PH) network abstraction layer (NAL) unit included in the picture unit; obtaining at least one video coding layer (VCL) NAL unit included in the picture unit; and parsing the JCST flag, which signals at the transform block level whether the JCST is applied.
- The method of claim 1, further comprising: obtaining, from the encoded video bitstream, a coded video sequence (CVS) including a picture unit corresponding to a coded picture; obtaining a picture header (PH) network abstraction layer (NAL) unit included in the picture unit; obtaining at least one video coding layer (VCL) NAL unit included in the picture unit; and parsing the JCST flag, which signals at the CU or CB level whether the JCST is applied.
- The method of claim 1, further comprising: obtaining, from the encoded video bitstream, a coded video sequence (CVS) including a picture unit corresponding to a coded picture; obtaining a picture header (PH) network abstraction layer (NAL) unit included in the picture unit; obtaining at least one video coding layer (VCL) NAL unit included in the picture unit; and parsing the JCST flag, which signals through high-level syntax whether the JCST is applied.
- The method of claim 1, wherein the JCST includes a secondary transform determined based on coding information.
- A device for decoding an encoded video bitstream, comprising: at least one memory configured to store computer program code; and at least one processor configured to access the at least one memory and to operate according to the computer program code, the computer program code comprising: first acquiring code configured to cause the at least one processor to acquire an encoded video bitstream, the encoded video bitstream including encoded color components; first parsing code configured to cause the at least one processor to entropy parse the encoded color components; dequantization code configured to cause the at least one processor to dequantize the color components and obtain transform coefficients of the color components, the transform coefficients including Cb and Cr transform coefficients located in two respective 4x2 blocks; JCST application code configured to cause the at least one processor to apply a joint components secondary transform (JCST) element by element to each pair of Cb and Cr transform coefficients located at the same corresponding coordinates in the two 4x2 blocks, thereby generating JCST outputs located at the same corresponding coordinates in two different 4x2 blocks, the JCST being a 2-point transform performed on each pair of Cb and Cr transform coefficients in the two 4x2 blocks to generate a pair of output coefficients that replace that pair; inverse transform code configured to cause the at least one processor to apply an inverse transform to the JCST outputs to obtain residual components of the color components; and decoding code configured to cause the at least one processor to decode the encoded video bitstream based on the residual components of the color components, wherein a JCST flag signaling whether the JCST application code is executed is signaled only when the last non-zero coefficient of the color components to which the JCST is applied is located at a position, in scanning order, greater than a threshold.
- The device of claim 8, wherein the transform coefficients include Y, Cb, and Cr transform coefficients.
- The device of claim 8, wherein the JCST application code is configured to cause the at least one processor to perform the JCST for a limited range of block sizes.
- The device of claim 8, wherein the computer program code further comprises: second acquiring code configured to cause the at least one processor to obtain, from the encoded video bitstream, a coded video sequence (CVS) including a picture unit corresponding to a coded picture; third acquiring code configured to cause the at least one processor to obtain a picture header (PH) network abstraction layer (NAL) unit included in the picture unit; fourth acquiring code configured to cause the at least one processor to obtain at least one video coding layer (VCL) NAL unit included in the picture unit; and second parsing code configured to cause the at least one processor to parse the JCST flag, which signals at the transform block level whether the JCST application code is executed.
- The device of claim 8, wherein the computer program code further comprises: second acquiring code configured to cause the at least one processor to obtain, from the encoded video bitstream, a coded video sequence (CVS) including a picture unit corresponding to a coded picture; third acquiring code configured to cause the at least one processor to obtain a picture header (PH) network abstraction layer (NAL) unit included in the picture unit; fourth acquiring code configured to cause the at least one processor to obtain at least one video coding layer (VCL) NAL unit included in the picture unit; and second parsing code configured to cause the at least one processor to parse the JCST flag, which signals at the CU or CB level whether the JCST application code is executed.
- The device of claim 8, wherein the computer program code further comprises: second acquiring code configured to cause the at least one processor to obtain, from the encoded video bitstream, a coded video sequence (CVS) including a picture unit corresponding to a coded picture; third acquiring code configured to cause the at least one processor to obtain a picture header (PH) network abstraction layer (NAL) unit included in the picture unit; fourth acquiring code configured to cause the at least one processor to obtain at least one video coding layer (VCL) NAL unit included in the picture unit; and second parsing code configured to cause the at least one processor to parse the JCST flag, which signals through high-level syntax whether the JCST application code is executed.
- A non-transitory computer-readable storage medium storing instructions that, when executed, cause at least one processor to: acquire an encoded video bitstream, the encoded video bitstream including encoded color components; entropy parse the encoded color components; dequantize the color components and obtain transform coefficients of the color components, the transform coefficients including Cb and Cr transform coefficients located in two respective 4x2 blocks; apply a joint components secondary transform (JCST) element by element to each pair of Cb and Cr transform coefficients located at the same corresponding coordinates in the two 4x2 blocks, thereby generating JCST outputs located at the same corresponding coordinates in two different 4x2 blocks, the JCST being a 2-point transform performed on each pair of Cb and Cr transform coefficients in the two 4x2 blocks to generate a pair of output coefficients that replace that pair; perform an inverse transform on the JCST outputs to obtain residual components of the color components; and decode the encoded video bitstream based on the residual components of the color components, wherein a JCST flag signaling whether the JCST is applied is signaled only when the last non-zero coefficient of the color components to which the JCST is applied is located at a position, in scanning order, greater than a threshold.
- (canceled)
- (canceled)
- (canceled)
- (canceled)
- (canceled)
- (canceled)
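The element-wise 2-point JCST recited in the claims can be sketched as follows. This is a minimal illustration, not the patented implementation: the 2x2 Hadamard-like kernel is an assumption chosen for clarity (the claims do not fix a particular kernel), and `jcst_2pt` is a hypothetical name. Each co-located (Cb, Cr) coefficient pair in the two 4x2 blocks is multiplied by the kernel, and the resulting pair replaces the inputs.

```python
import numpy as np

def jcst_2pt(cb, cr, kernel=None):
    """Apply a 2-point joint components secondary transform element by element.

    cb, cr: 4x2 arrays of dequantized Cb/Cr transform coefficients.
    kernel: 2x2 matrix applied to each co-located (Cb, Cr) pair; the
            Hadamard-like default is an illustrative assumption, not the
            kernel specified by the patent.
    """
    if kernel is None:
        kernel = np.array([[1, 1], [1, -1]])  # hypothetical 2-point kernel
    cb, cr = np.asarray(cb), np.asarray(cr)
    assert cb.shape == cr.shape == (4, 2)
    out_cb, out_cr = np.empty_like(cb), np.empty_like(cr)
    for i in range(4):
        for j in range(2):
            # The output pair replaces the input pair at the same coordinates.
            out_cb[i, j], out_cr[i, j] = kernel @ np.array([cb[i, j], cr[i, j]])
    return out_cb, out_cr
```

With the default kernel, strongly correlated chroma residuals concentrate energy in the first output block, which is the usual motivation for a joint chroma transform.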
Description
Joint component secondary transform

[Cross-reference to related applications] This application claims priority to U.S. Provisional Application No. 63/020,280, filed May 5, 2020, and U.S. Patent Application No. 17/072,606, filed October 16, 2020, the entire contents of which are incorporated herein by reference.

The present disclosure relates generally to the field of data processing, and more specifically to video encoding and decoding. In particular, embodiments of the present disclosure relate to a joint component secondary transform (JCST) scheme for coding residuals from multiple color components, e.g., residuals from two chroma components.

AV1 (AOMedia Video 1) is an open video coding format designed for video transmission over the Internet. As a successor to VP9, AV1 was developed by the Alliance for Open Media (AOMedia), a consortium founded in 2015 that includes semiconductor companies, video-on-demand providers, video content producers, software development companies, and web browser vendors. Many components of the AV1 project originated from earlier research efforts by AOMedia members. Individual contributors had launched experimental technology platforms years before: Daala, from Xiph/Mozilla, released code in 2010; Google's experimental VP9 evolution project, VP10, was announced on September 12, 2014; and Cisco's Thor was announced on August 11, 2015. Built on the VP9 codebase, AV1 incorporates additional technologies, some of which were developed for these experimental formats. The first version of the AV1 reference codec (0.1.0) was released on April 7, 2016. AOMedia announced the release of the AV1 bitstream specification on March 28, 2018, along with a reference software-based encoder and decoder. A validated version 1.0.0 of the specification was released on June 25, 2018, and a validated version 1.0.0 with Errata 1 was released on January 8, 2019. The AV1 bitstream specification includes a reference video codec.
FIG. 1 is a schematic diagram of coded coefficients covered by local templates. FIG. 2 is a block diagram of a communication system according to embodiments. FIG. 3 is a diagram of the deployment of a G-PCC compressor and a G-PCC decompressor in an environment according to embodiments. FIG. 4 is a schematic diagram of an encoder/decoder scheme according to embodiments. FIG. 5 is a schematic diagram of an encoder/decoder scheme according to embodiments. FIG. 6 is a schematic diagram of pairs of Cb and Cr transform coefficients from two 4x2 blocks according to embodiments. FIG. 7 is a schematic diagram of JCST applied to two 4x2 Cb and Cr blocks according to embodiments. FIG. 8 is a schematic diagram of a JCST using a 4-point transform according to embodiments. FIG. 9 is a flowchart illustrating a decoding method according to embodiments. FIG. 10 is a diagram of a computer system suitable for implementing embodiments.

The embodiments described in this specification provide a method and apparatus for encoding and/or decoding video data.

[Residual Coding in AV1] For each given transform unit, the AV1 coefficient coder starts by coding a skip sign, which is followed by the transform kernel type and the end-of-block (EOB) position of all non-zero coefficients when transform coding is not skipped. Each coefficient value is then mapped onto multiple level maps and a sign, where the sign plane covers the signs of the coefficients and the three level planes correspond to different ranges of coefficient magnitudes, namely the lower-level, middle-level, and upper-level planes. The lower-level plane corresponds to magnitudes in the range 0-2, the middle-level plane to the range 3-14, and the upper-level plane to magnitudes of 15 and above.
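The decomposition of a coefficient magnitude across the three level planes can be illustrated with a small sketch. The function name and the additive decomposition are illustrative assumptions; only the plane boundaries (0-2, 3-14, 15 and above) come from the text.

```python
def level_planes(magnitude):
    """Split a non-negative coefficient magnitude across AV1-style level planes.

    Returns (lower, middle, upper): the lower-level plane carries up to 2,
    the middle-level plane up to a further 12 (covering the range 3-14),
    and the upper-level plane the remainder (magnitudes of 15 and above).
    Plane boundaries follow the text; the decomposition itself is a sketch.
    """
    lower = min(magnitude, 2)
    middle = min(max(magnitude - 2, 0), 12)
    upper = max(magnitude - 14, 0)
    return lower, middle, upper
```

For example, a magnitude of 20 contributes 2 to the lower-level plane, 12 to the middle-level plane, and the residual 6 is what would be Exp-Golomb coded in the upper-level plane.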
After the EOB position is coded, the lower-level and middle-level planes are coded together in reverse scan order, the lower-level plane indicating whether the coefficient magnitude lies in the range 0-2 and the middle-level plane indicating whether it lies in the range 3-14. The sign plane and upper-level plane are then coded together in forward scan order, the upper-level plane representing the residual value for coefficients with magnitude greater than 14; the remainder is entropy-coded using an Exp-Golomb code. AV1 adopts the traditional zigzag scan order. This separation allows a rich context model to be assigned to the lower-level plane, which accounts for the transform direction (bi-directional, horizontal, and vertical), the transform size, and up to five neighboring coefficients, for improved compression efficiency at a moderate context model size. The middle-level plane uses a context model similar to that of the lower-level plane, with the number of context neighboring coefficients reduced from five to two. The upper-level plane is coded using Exp-Golomb codes without a context model. In the sign plane, signs are coded using the DC signs of neighboring transform units as context.
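The Exp-Golomb code mentioned above can be sketched for the order-0 case: a value v is coded as k zero bits followed by the (k+1)-bit binary representation of v + 1. This is the generic order-0 construction, shown only to illustrate the entropy coding of the upper-level residual; the surrounding AV1 bitstream details (and the exact Exp-Golomb order AV1 uses for this syntax element) are omitted.

```python
def exp_golomb(value):
    """Return the order-0 Exp-Golomb codeword for a non-negative integer,
    as a bit string: a run of zeros equal in length to len(bin(value+1)) - 1,
    followed by the binary form of value + 1."""
    code = bin(value + 1)[2:]          # binary representation of value + 1
    return "0" * (len(code) - 1) + code
```

Small values get short codewords (0 codes to a single bit), which suits the upper-level plane, where most residuals beyond the 0-14 range are still small.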