US-12621496-B2 - Methods and apparatus of video coding in 4:4:4 chroma format


Abstract

An electronic apparatus performs a method of decoding video data, including receiving, from a bitstream, a first syntax element in a slice header of a slice that indicates whether luma mapping with chroma scaling (LMCS) is applied to a coding unit in the slice; receiving a second syntax element for the coding unit that indicates whether the coding unit has been coded using adaptive color-space transform (ACT); if the second syntax element has a non-zero value, decoding the coding unit by applying inverse ACT to convert luma and chroma residuals of the coding unit from a transformed color space to an original color space of the video data; and if the first syntax element has a non-zero value, decoding the coding unit by applying inverse luma mapping to the luma samples and inverse scaling to the chroma residuals of the coding unit after performing the inverse ACT.
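The decoding order stated in the abstract (inverse ACT first, then the LMCS inverse steps) can be illustrated with a minimal per-sample sketch. Everything named here is an assumption for illustration: the function names, the use of the YCgCo-R lifting transform as the ACT, the component order returned by the inverse transform, and the two callables `inv_luma_map` and `inv_chroma_scale`, which stand in for the LMCS inverse operations that this text does not spell out.

```python
from typing import Callable, Tuple

Triple = Tuple[int, int, int]


def inverse_act_ycgco_r(y: int, cg: int, co: int) -> Triple:
    """Inverse YCgCo-R lifting (assumed ACT) for one residual triple.

    Returns the residuals in the original color space, ordered as
    (first, second, third) component; here assumed to be (G, B, R).
    """
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = co + b
    return g, b, r


def decode_sample(residual: Triple, pred: Triple, act_flag: int, lmcs_flag: int,
                  inv_luma_map: Callable[[int], int],
                  inv_chroma_scale: Callable[[int], int]) -> Triple:
    """Reconstruct one sample triple, applying inverse ACT before LMCS steps."""
    r0, r1, r2 = residual
    if act_flag:
        # Step 1: convert residuals back to the original color space.
        r0, r1, r2 = inverse_act_ycgco_r(r0, r1, r2)
    if lmcs_flag:
        # Step 2: only after inverse ACT, undo chroma residual scaling
        # and map the reconstructed luma sample back.
        r1, r2 = inv_chroma_scale(r1), inv_chroma_scale(r2)
        luma = inv_luma_map(pred[0] + r0)
    else:
        luma = pred[0] + r0
    return luma, pred[1] + r1, pred[2] + r2


# Hypothetical stand-ins for the LMCS inverse operations.
print(decode_sample((10, 4, 2), (100, 50, 60), 1, 1,
                    inv_luma_map=lambda v: v * 2,
                    inv_chroma_scale=lambda r: r * 3))
```

The point of the sketch is only the ordering: when both flags are set, the chroma scaling inverse operates on residuals that have already left the transformed color space.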

Inventors

Xiaoyu Xiu
Yi-Wen Chen
Tsung-Chuan Ma
Hong-Jheng Jhu
Xianglin Wang
Bing Yu

Assignees

Beijing Dajia Internet Information Technology Co., Ltd.

Dates

Publication Date: 2026-05-05
Application Date: 2024-07-03

Claims (20)

1. A method of video encoding, comprising: determining whether a coding unit has been coded using adaptive color-space transform (ACT), wherein the coding unit is coded by intra-prediction mode; in accordance with a determination that the coding unit has not been coded using ACT: determining whether chroma components of the coding unit have been coded using block differential pulse coded modulation (BDPCM); and decoding chroma components of the coding unit based on the determination of whether the chroma components of the coding unit have been coded using BDPCM; and in accordance with a determination that the coding unit has been coded using ACT: decoding the coding unit by applying inverse ACT.
2. The method of claim 1, wherein the method further comprises: signaling a first syntax element, wherein the first syntax element indicates whether the coding unit has been coded using ACT; in accordance with a determination that the first syntax element has a zero value: signaling one or more syntax elements, wherein the one or more syntax elements indicate whether the chroma components of the coding unit have been coded using BDPCM; and in accordance with a determination that the first syntax element has a non-zero value: not signaling the one or more syntax elements, wherein when the one or more syntax elements are not signaled, default values are assigned to the one or more syntax elements.
3. The method of claim 2, wherein the method further comprises: signaling a second syntax element, wherein the second syntax element indicates whether video data has a predefined chroma format, wherein the signaling the first syntax element comprises: in response to the second syntax element indicating that the video data has the predefined chroma format, signaling the first syntax element.
4. The method of claim 3, wherein the predefined chroma format is 4:4:4 chroma format.
5. The method of claim 3, wherein the first syntax element is signaled only when the video data has the predefined chroma format.
6. The method of claim 1, wherein the default values indicate that the chroma components of the coding unit are decoded without using inverse BDPCM.
7. The method of claim 1, wherein the coding unit is decoded using inverse BDPCM when the coding unit is coded without applying transforms.
8. An electronic apparatus comprising: one or more processing units; a memory coupled to the one or more processing units; and a plurality of programs stored in the memory that, when executed by the one or more processing units, cause the electronic apparatus to perform operations comprising: determining whether a coding unit has been coded using adaptive color-space transform (ACT), wherein the coding unit is coded by intra-prediction mode; in accordance with a determination that the coding unit has not been coded using ACT: determining whether chroma components of the coding unit have been coded using block differential pulse coded modulation (BDPCM); and decoding chroma components of the coding unit based on the determination of whether the chroma components of the coding unit have been coded using BDPCM; and in accordance with a determination that the coding unit has been coded using ACT: decoding the coding unit by applying inverse ACT.
9. The electronic apparatus of claim 8, wherein the operations further comprise: signaling a first syntax element, wherein the first syntax element indicates whether the coding unit has been coded using ACT; in accordance with a determination that the first syntax element has a zero value: signaling one or more syntax elements, wherein the one or more syntax elements indicate whether the chroma components of the coding unit have been coded using BDPCM; and in accordance with a determination that the first syntax element has a non-zero value: not signaling the one or more syntax elements, wherein when the one or more syntax elements are not signaled, default values are assigned to the one or more syntax elements.
10. The electronic apparatus of claim 9, wherein the operations further comprise: signaling a second syntax element, wherein the second syntax element indicates whether video data has a predefined chroma format, wherein the signaling the first syntax element comprises: in response to the second syntax element indicating that the video data has the predefined chroma format, signaling the first syntax element.
11. The electronic apparatus of claim 10, wherein the predefined chroma format is 4:4:4 chroma format.
12. The electronic apparatus of claim 10, wherein the first syntax element is signaled only when the video data has the predefined chroma format.
13. The electronic apparatus of claim 9, wherein the default values indicate that the chroma components of the coding unit are decoded without using inverse BDPCM.
14. The electronic apparatus of claim 8, wherein the coding unit is decoded using inverse BDPCM when the coding unit is coded without applying transforms.
15. A non-transitory computer readable storage medium having stored therein a bitstream comprising video information generated by a method of video encoding, the method comprising: determining whether a coding unit has been coded using adaptive color-space transform (ACT), wherein the coding unit is coded by intra-prediction mode; in accordance with a determination that the coding unit has not been coded using ACT: determining whether chroma components of the coding unit have been coded using block differential pulse coded modulation (BDPCM); and decoding chroma components of the coding unit based on the determination of whether the chroma components of the coding unit have been coded using BDPCM; and in accordance with a determination that the coding unit has been coded using ACT: decoding the coding unit by applying inverse ACT.
16. The non-transitory computer readable storage medium of claim 15, wherein the method further comprises: signaling a first syntax element, wherein the first syntax element indicates whether the coding unit has been coded using ACT; in accordance with a determination that the first syntax element has a zero value: signaling one or more syntax elements, wherein the one or more syntax elements indicate whether the chroma components of the coding unit have been coded using BDPCM; and in accordance with a determination that the first syntax element has a non-zero value: not signaling the one or more syntax elements, wherein when the one or more syntax elements are not signaled, default values are assigned to the one or more syntax elements.
17. The non-transitory computer readable storage medium of claim 16, wherein the method further comprises: signaling a second syntax element, wherein the second syntax element indicates whether video data has a predefined chroma format, wherein the signaling the first syntax element comprises: in response to the second syntax element indicating that the video data has the predefined chroma format, signaling the first syntax element.
18. The non-transitory computer readable storage medium of claim 17, wherein the predefined chroma format is 4:4:4 chroma format.
19. The non-transitory computer readable storage medium of claim 17, wherein the first syntax element is signaled only when the video data has the predefined chroma format.
20. The non-transitory computer readable storage medium of claim 16, wherein the default values indicate that the chroma components of the coding unit are decoded without using inverse BDPCM.
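
The conditional signaling recited in the dependent claims can be read from the parsing side as the following minimal sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `parse_cu_flags`, `cu_act_flag`, `chroma_bdpcm_flag` and `chroma_bdpcm_dir` are hypothetical, each syntax element is modeled as a single raw bit rather than an entropy-coded bin, and inferring the ACT flag as zero when it is not signaled is an assumption.

```python
from typing import Dict, Iterator


def parse_cu_flags(bits: Iterator[int], chroma_format_444: bool) -> Dict[str, int]:
    """Parse the ACT and chroma BDPCM flags for one intra-coded coding unit."""
    # The ACT flag is present only for the predefined (4:4:4) chroma format;
    # otherwise it is assumed to be inferred as 0.
    cu_act_flag = next(bits) if chroma_format_444 else 0

    if cu_act_flag == 0:
        # ACT is off: the chroma BDPCM syntax elements are signaled.
        chroma_bdpcm_flag = next(bits)
        chroma_bdpcm_dir = next(bits) if chroma_bdpcm_flag else 0
    else:
        # ACT is on: the chroma BDPCM syntax elements are not signaled and
        # take default values meaning "no inverse BDPCM for chroma".
        chroma_bdpcm_flag = 0
        chroma_bdpcm_dir = 0

    return {
        "cu_act_flag": cu_act_flag,
        "chroma_bdpcm_flag": chroma_bdpcm_flag,
        "chroma_bdpcm_dir": chroma_bdpcm_dir,
    }


# ACT enabled: only one bit is consumed and BDPCM defaults apply.
print(parse_cu_flags(iter([1]), chroma_format_444=True))
# ACT disabled: the BDPCM flag and direction are read from the bitstream.
print(parse_cu_flags(iter([0, 1, 1]), chroma_format_444=True))
```

Skipping the chroma BDPCM elements when ACT is enabled saves bits for a combination that the claims treat as not used together.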

Description

RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 17/716,971, filed Apr. 8, 2022, which is a continuation of International Application No. PCT/US2020/055264, filed on Oct. 12, 2020, which claims priority to U.S. Provisional Patent Application No. 62/914,282, entitled “METHODS AND APPARATUS OF VIDEO CODING IN 4:4:4 CHROMA FORMAT” filed on Oct. 11, 2019, and also claims priority to U.S. Provisional Patent Application No. 62/923,390, entitled “METHODS AND APPARATUS OF VIDEO CODING IN 4:4:4 CHROMA FORMAT” filed on Oct. 18, 2019, both of which are incorporated by reference in their entirety.

TECHNICAL FIELD

The present application generally relates to video data coding and compression, and in particular, to methods and systems of performing adaptive color-space transform (ACT) with chroma residual scaling.

BACKGROUND

Digital video is supported by a variety of electronic devices, such as digital televisions, laptop or desktop computers, tablet computers, digital cameras, digital recording devices, digital media players, video gaming consoles, smart phones, video teleconferencing devices, video streaming devices, etc. The electronic devices transmit, receive, encode, decode, and/or store digital video data by implementing video compression/decompression standards as defined by MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), and the Versatile Video Coding (VVC) standard.

Video compression typically includes performing spatial (intra frame) prediction and/or temporal (inter frame) prediction to reduce or remove redundancy inherent in the video data. For block-based video coding, a video frame is partitioned into one or more slices, each slice having multiple video blocks, which may also be referred to as coding tree units (CTUs). Each CTU may contain one coding unit (CU) or be recursively split into smaller CUs until the predefined minimum CU size is reached.
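
The recursive partitioning of a CTU into CUs can be pictured with a short sketch. It assumes a pure quadtree split (each split produces four equal squares) and a caller-supplied `should_split` decision; both are simplifications for illustration, and the function and parameter names are hypothetical.

```python
from typing import Callable, List, Tuple

CU = Tuple[int, int, int]  # (x, y, size) of a square coding unit


def split_ctu(x: int, y: int, size: int, min_cu_size: int,
              should_split: Callable[[int, int, int], bool]) -> List[CU]:
    """Recursively split a square block until no split is chosen
    or the predefined minimum CU size is reached."""
    if size > min_cu_size and should_split(x, y, size):
        half = size // 2
        cus: List[CU] = []
        for dy in (0, half):
            for dx in (0, half):
                cus.extend(split_ctu(x + dx, y + dy, half, min_cu_size, should_split))
        return cus
    # Leaf: this block is a coding unit.
    return [(x, y, size)]


# Split a 64x64 CTU once, producing four 32x32 CUs.
print(split_ctu(0, 0, 64, 8, lambda x, y, s: s > 32))
```

In a real encoder the split decision would come from a rate-distortion search, and in a decoder from parsed split flags; the lambda above only stands in for that decision.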
Each CU (also named leaf CU) contains one or multiple transform units (TUs) and each CU also contains one or multiple prediction units (PUs). Each CU can be coded in intra, inter, or intra block copy (IBC) mode. Video blocks in an intra coded (I) slice of a video frame are encoded using spatial prediction with respect to reference samples in neighboring blocks within the same video frame. Video blocks in an inter coded (P or B) slice of a video frame may use spatial prediction with respect to reference samples in neighboring blocks within the same video frame or temporal prediction with respect to reference samples in other previous and/or future reference video frames.

Spatial or temporal prediction based on a reference block that has been previously encoded, e.g., a neighboring block, results in a predictive block for a current video block to be coded. The process of finding the reference block may be accomplished by a block matching algorithm. Residual data representing pixel differences between the current block to be coded and the predictive block is referred to as a residual block or prediction errors. An inter-coded block is encoded according to a motion vector that points to a reference block in a reference frame forming the predictive block, and the residual block. The process of determining the motion vector is typically referred to as motion estimation. An intra coded block is encoded according to an intra prediction mode and the residual block.

For further compression, the residual block is transformed from the pixel domain to a transform domain, e.g., frequency domain, resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of transform coefficients, and then entropy encoded into a video bitstream to achieve even more compression.
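
Three of the steps just described (forming the residual block, quantizing, and scanning the two-dimensional array into a one-dimensional vector) can be sketched as follows. This is a simplified illustration: the transform and entropy coding stages are omitted, the quantizer is a plain uniform quantizer with truncation toward zero, and the zig-zag scan order is one common choice assumed here, not necessarily the scan a given standard uses. All function names are hypothetical.

```python
from typing import List

Block = List[List[int]]


def residual_block(current: Block, predicted: Block) -> Block:
    """Pixel differences between the current block and the predictive block."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


def quantize(block: Block, step: int) -> Block:
    """Uniform quantization with truncation toward zero."""
    return [[(abs(v) // step) * (1 if v >= 0 else -1) for v in row]
            for row in block]


def zigzag_scan(block: Block) -> List[int]:
    """Scan a square 2-D array into a 1-D vector along anti-diagonals,
    alternating direction on each diagonal."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]


current = [[10, 12], [9, 7]]
predicted = [[8, 12], [12, 7]]
res = residual_block(current, predicted)
print(zigzag_scan(quantize(res, step=2)))
```

A good prediction leaves a residual that is mostly zeros after quantization, which is what makes the scanned vector cheap to entropy encode.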
The encoded video bitstream is then saved in a computer-readable storage medium (e.g., flash memory) to be accessed by another electronic device with digital video capability, or directly transmitted to the electronic device over a wired or wireless connection. The electronic device then performs video decompression (which is an opposite process to the video compression described above) by, e.g., parsing the encoded video bitstream to obtain syntax elements from the bitstream and reconstructing the digital video data to its original format from the encoded video bitstream based at least in part on the syntax elements obtained from the bitstream, and renders the reconstructed digital video data on a display of the electronic device.

With digital video quality going from high definition to 4K×2K or even 8K×4K, the amount of video data to be encoded/decoded grows exponentially. Encoding and decoding the video data more efficiently while maintaining the image quality of the decoded video data is a constant challenge. Certain video content, e.g., screen content videos, is encoded