CN-122002041-A - Method and apparatus for video encoding and decoding

CN122002041A

Abstract

An electronic device performs a method of decoding video data. The method includes: receiving video data corresponding to a coding unit from a bitstream, wherein the coding unit is encoded in an inter prediction mode or an intra block copy mode; receiving a first syntax element from the video data, wherein the first syntax element indicates whether the coding unit has any non-zero residuals; in accordance with a determination that the first syntax element has a non-zero value, receiving a second syntax element from the video data, wherein the second syntax element indicates whether the coding unit has been coded using an adaptive color space transform (ACT); in accordance with a determination that the first syntax element has a zero value, assigning a zero value to the second syntax element; and determining whether to perform an inverse ACT on the video data of the coding unit based on the value of the second syntax element.
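The conditional signalling described in the abstract can be sketched as follows. This is a minimal illustration, not the normative VVC parsing process; the function name and the bitstream-reader callable are hypothetical.

```python
def parse_act_flag(read_bit, cu_coded_flag):
    """Derive the ACT flag for one coding unit.

    cu_coded_flag: the first syntax element -- non-zero when the CU
        has any non-zero residual data.
    read_bit: callable that returns the next bit from the bitstream.
    """
    if cu_coded_flag:
        # The second syntax element is only present in the bitstream
        # when residual data exists for the coding unit.
        cu_act_enabled_flag = read_bit()
    else:
        # No residuals: the ACT flag is not signalled and is
        # inferred to be zero, so no inverse ACT is performed.
        cu_act_enabled_flag = 0
    return cu_act_enabled_flag


# Example: the bitstream supplies a flag bit only when residuals exist.
bits = iter([1])
assert parse_act_flag(lambda: next(bits), cu_coded_flag=1) == 1
assert parse_act_flag(lambda: None, cu_coded_flag=0) == 0
```

The decoder then performs the inverse ACT on the residual data only when the derived flag is non-zero.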

Inventors

  • XIU XIAOYU
  • CHEN YIWEN
  • MA ZONGQUAN
  • ZHU HONGZHENG
  • WANG XIANGLIN
  • YU BING

Assignees

  • Beijing Dajia Internet Information Technology Co., Ltd. (北京达佳互联信息技术有限公司)

Dates

Publication Date
2026-05-08
Application Date
2020-09-23
Priority Date
2019-09-23

Claims (20)

  1. A method of decoding video data, comprising: receiving the video data corresponding to a coding unit from a bitstream, wherein the coding unit is encoded in an inter prediction mode or an intra block copy mode; determining a first syntax element from the video data, wherein the first syntax element indicates whether residual data is present for the coding unit, wherein the residual data has a transform tree structure; in accordance with a determination that the first syntax element has a non-zero value: determining a second syntax element from the video data, wherein the second syntax element indicates whether an adaptive color space transform (ACT) has been applied to the coding unit; and performing an inverse ACT on the residual data of the coding unit in accordance with the second syntax element having a non-zero value.
  2. The method of claim 1, further comprising: determining not to perform the inverse ACT on the residual data of the coding unit in accordance with the second syntax element having a zero value.
  3. The method of claim 1, further comprising: assigning a zero value to the second syntax element in accordance with a determination that the first syntax element has a zero value.
  4. The method of claim 1, wherein the coding unit is coded with the ACT in a lossy manner.
  5. The method of claim 1, wherein the coding unit is coded with the ACT in a lossless manner.
  6. The method of claim 1, wherein the inverse ACT is from the YCgCo color space to the YCbCr color space.
  7. The method of claim 1, wherein the inverse ACT is from the YCgCo color space to the RGB color space.
  8. The method of claim 1, wherein the coding unit is encoded in a 4:4:4 chroma format.
  9. The method of claim 1, wherein the first syntax element having the zero value indicates that the residual data is not present for the coding unit, and the first syntax element having the non-zero value indicates that the residual data is present for the coding unit.
  10. The method of claim 1, wherein the second syntax element is a cu_act_enabled flag, the cu_act_enabled flag being a CU-level flag.
  11. A method of encoding a coding unit within a video frame, comprising: performing inter prediction or intra block copy prediction on the coding unit; determining a value of a first syntax element, wherein the first syntax element indicates whether residual data is present for the coding unit, wherein the residual data has a transform tree structure; determining a value of a second syntax element, wherein the second syntax element indicates whether an ACT has been applied to the coding unit; and signaling the value of the second syntax element in accordance with a determination that the first syntax element has a non-zero value.
  12. The method of claim 11, further comprising: performing the ACT on the residual data of the coding unit in accordance with a determination that the second syntax element has a non-zero value.
  13. The method of claim 11, further comprising: determining not to perform the ACT on the residual data of the coding unit in accordance with a determination that the second syntax element has a zero value.
  14. The method of claim 11, wherein the coding unit is coded with the ACT in a lossy manner.
  15. The method of claim 11, wherein the coding unit is coded with the ACT in a lossless manner.
  16. The method of claim 11, wherein the ACT is from the YCbCr color space to the YCgCo color space.
  17. The method of claim 11, wherein the ACT is from an RGB color space to a YCgCo color space.
  18. The method of claim 11, wherein the coding unit is encoded in a 4:4:4 chroma format.
  19. The method of claim 11, wherein the first syntax element having the zero value indicates that the residual data is not present for the coding unit, and the first syntax element having the non-zero value indicates that the residual data is present for the coding unit.
  20. The method of claim 11, wherein the second syntax element is a cu_act_enabled flag, the cu_act_enabled flag being a CU-level flag.
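Several claims (6, 7, 16, 17) refer to the ACT color-space conversion between YCgCo and RGB or YCbCr. In VVC, the ACT is based on the reversible integer YCgCo-R transform; a sketch of the forward and inverse lifting steps might look like the following. This is illustrative only and not the normative specification text.

```python
def forward_ycgco_r(r, g, b):
    """Lossless RGB -> YCgCo-R via integer lifting steps."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co


def inverse_ycgco_r(y, cg, co):
    """Lossless YCgCo-R -> RGB, exactly undoing the forward steps."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


# Round-trip check: the transform pair is exactly invertible,
# which is what makes lossless coding with ACT possible (claim 15).
for rgb in [(0, 0, 0), (255, 128, 64), (13, 200, 97)]:
    assert inverse_ycgco_r(*forward_ycgco_r(*rgb)) == rgb
```

Because each lifting step is individually reversible in integer arithmetic, the same transform pair serves both the lossy and lossless coding configurations mentioned in claims 4, 5, 14 and 15.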

Description

Method and apparatus for video encoding and decoding

This application is a divisional of the patent application entitled "Method and apparatus for video coding and decoding in a 4:4:4 chroma format," application number 202080050545.7, filed on September 23, 2020.

RELATED APPLICATIONS

The present application claims priority from U.S. provisional patent application No. 62/904,539, entitled "METHODS AND APPARATUS OF VIDEO CODING IN 4:4:4 CHROMA FORMAT," filed on September 23, 2019, the entire contents of which are incorporated by reference.

Technical Field

The present application relates generally to video data coding and compression, and in particular to methods and systems for improving the coding efficiency of video.

Background

Digital video is supported by a variety of electronic devices such as digital televisions, laptop or desktop computers, tablet computers, digital cameras, digital recording devices, digital media players, video game consoles, smart phones, video teleconferencing devices, video streaming devices, and the like. These electronic devices transmit, receive, encode, decode and/or store digital video data by implementing video compression/decompression standards such as MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), and Versatile Video Coding (VVC).

Video compression typically includes performing spatial (intra) prediction and/or temporal (inter) prediction to reduce or remove redundancy inherent in video data. For block-based video coding, a video frame is partitioned into one or more slices, each slice having a plurality of video blocks, which may also be referred to as Coding Tree Units (CTUs). Each CTU may contain one Coding Unit (CU), or be split recursively into smaller CUs until a predefined minimum CU size is reached.
Each CU (also referred to as a leaf CU) contains one or more Transform Units (TUs), and each CU also contains one or more Prediction Units (PUs). Each CU may be encoded and decoded in intra, inter or IBC mode. Video blocks in an intra-coded (I) slice of a video frame are coded using spatial prediction with respect to reference samples in neighboring blocks within the same video frame. Video blocks in inter-coded (P or B) slices of a video frame may use spatial prediction with respect to reference samples in neighboring blocks within the same video frame, or use temporal prediction with respect to reference samples in other previous and/or future reference video frames.

A prediction block for a current video block to be encoded is generated based on spatial or temporal prediction of a reference block (e.g., a neighboring block) that has been previously encoded. The process of finding the reference block may be accomplished by a block matching algorithm. Residual data representing pixel differences between a current block to be encoded and a prediction block is referred to as a residual block or prediction error. The inter-coded block is coded based on a residual block and a motion vector pointing to a reference block in a reference frame forming the prediction block. The process of determining motion vectors is commonly referred to as motion estimation. The intra-coded block is coded according to an intra-prediction mode and a residual block.

For further compression, the residual block is transformed from the pixel domain to a transform domain (e.g., frequency domain), resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of transform coefficients, and then entropy encoded into the video bitstream to achieve even more compression.
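The residual pipeline described above (form the residual from prediction, then quantize) can be sketched as a toy example. This omits the transform and entropy-coding stages for brevity, and the function names are illustrative, not part of any codec API.

```python
def encode_block(current, prediction, qstep):
    """Toy residual coding: residual followed by uniform quantization.

    A real codec would transform the residual (e.g., with a DCT)
    before quantizing; that stage is omitted here for brevity.
    """
    residual = [c - p for c, p in zip(current, prediction)]
    levels = [round(r / qstep) for r in residual]
    return levels


def decode_block(levels, prediction, qstep):
    """Inverse path: dequantize the levels and add the prediction back."""
    recon_residual = [lv * qstep for lv in levels]
    return [p + r for p, r in zip(prediction, recon_residual)]


current = [104, 98, 96, 101]
prediction = [100, 100, 100, 100]
levels = encode_block(current, prediction, qstep=2)      # [2, -1, -2, 0]
recon = decode_block(levels, prediction, qstep=2)        # [104, 98, 96, 100]
# Reconstruction matches the original to within the quantization step.
assert all(abs(c - r) <= 1 for c, r in zip(current, recon))
```

The quantization step controls the trade-off: a larger step compresses the residual into smaller levels at the cost of larger reconstruction error, which is exactly the lossy/lossless distinction the ACT claims draw on.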
The encoded video bitstream is then stored in a computer-readable storage medium (e.g., flash memory) for access by another electronic device having digital video capabilities, or transmitted directly to that electronic device in a wired or wireless manner. The electronic device then performs video decompression (the reverse of the video compression described above) by, for example, parsing the encoded video bitstream to obtain syntax elements from the bitstream, reconstructing the digital video data into its original format based at least in part on those syntax elements, and rendering the reconstructed digital video data on a display of the electronic device.

As digital video quality goes from high definition to 4Kx2K or even 8Kx4K, the amount of video data to be encoded/decoded grows exponentially. How video data can be encoded/decoded more efficiently while preserving the image quality of the decoded video data is a continuing challenge. Some video content (e.g., screen content video) is encoded in a 4:4:4 chroma format in which all three components (a luma component and two chroma components) have the same resolution.