
EP-4736448-A2 - METHOD, APPARATUS, AND MEDIUM FOR VIDEO PROCESSING


Abstract

Embodiments of the present disclosure provide a solution for video processing. A method for video processing is proposed. The method comprises: applying, for a conversion between a video unit of a video and a bitstream of the video unit, at least one of the following in a cross-component prediction (CCP) model: a set of luma samples, a set of additional luma samples, a set of reconstructed chroma samples, or a set of predicted chroma samples; determining a prediction or reconstruction of the video unit by applying the CCP model to the video unit; and performing the conversion based on the prediction or reconstruction.

Inventors

  • ZHANG, KAI
  • ZHANG, LI

Assignees

  • ByteDance Inc.

Dates

Publication Date
2026-05-06
Application Date
2024-07-02

Claims (20)

  1. A method of video processing, comprising: applying, for a conversion between a video unit of a video and a bitstream of the video unit, at least one of the following in a cross-component prediction (CCP) model: a set of luma samples, a set of additional luma samples, a set of reconstructed chroma samples, or a set of predicted chroma samples; determining a prediction or reconstruction of the video unit by applying the CCP model to the video unit; and performing the conversion based on the prediction or reconstruction.
  2. The method of claim 1, wherein luma positions used in the CCP model are after down-sampling for a colour format.
  3. The method of claim 1, wherein the set of additional luma samples are at positions beyond a center position corresponding to a chroma sample to be predicted, a north position relative to the center position, a west position relative to the center position, an east position relative to the center position, and a south position relative to the center position.
  4. The method of claim 3, wherein the set of additional luma samples are at at least one of the following positions relative to the center position: a north-west position, a north-east position, a south-west position, or a south-east position.
  5. The method of claim 4, wherein predChromaVal = c0C + c1N + c2S + c3E + c4W + c5P + c6NW + c7NE + c8SW + c9SE + c10B, wherein predChromaVal represents a chroma sample to be predicted, C represents a luma sample at the center position, N represents a luma sample at the north position, S represents a luma sample at the south position, E represents a luma sample at the east position, W represents a luma sample at the west position, NW represents a luma sample at the north-west position, NE represents a luma sample at the north-east position, SW represents a luma sample at the south-west position, SE represents a luma sample at the south-east position, P and B represent a nonlinear term and a bias term, respectively, and c0, c1, c2, c3, c4, c5, c6, c7, c8, c9 and c10 represent parameters.
  6. The method of claim 5, wherein P = C^2 and/or B = 1 << (bitdepth - 1).
  7. The method of claim 3, wherein the set of additional luma samples are at a position non-adjacent to the center position.
  8. The method of claim 7, wherein predChromaVal = c0C + c1N + c2S + c3E + c4W + c5P + c6NW + c7NE + c8SW + c9SE + c10N2 + c11S2 + c12E2 + c13W2 + c14B, wherein predChromaVal represents a chroma sample to be predicted, C represents a luma sample at the center position, N represents a luma sample at the north position, S represents a luma sample at the south position, E represents a luma sample at the east position, W represents a luma sample at the west position, NW represents a luma sample at the north-west position, NE represents a luma sample at the north-east position, SW represents a luma sample at the south-west position, SE represents a luma sample at the south-east position, N2, S2, E2 and W2 represent luma samples at positions non-adjacent to the center position, respectively, P and B represent a nonlinear term and a bias term, respectively, and c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13 and c14 represent parameters.
  9. The method of claim 8, wherein P = C^2 and/or B = 1 << (bitdepth - 1).
  10. The method of claim 1, wherein if one or more luma samples are unavailable, padding is used to obtain the set of additional luma samples.
  11. The method of claim 1, wherein a function with at least one input as an additional luma sample is involved in the CCP model.
  12. The method of claim 11, wherein the function is a derivation of a gradient.
  13. The method of claim 1, wherein the set of luma samples and the set of additional luma samples are obtained with down-sampling approaches the same as those used in one of: cross-component linear model (CCLM), convolutional cross-component model (CCCM), gradient linear model (GLM), or CCCM using multiple downsampling filters (MF-CCCM).
  14. The method of claim 1, wherein the set of luma samples and the set of additional luma samples are obtained with down-sampling approaches different from those used in one of: CCLM, CCCM, GLM, or MF-CCCM.
  15. The method of claim 1, wherein in response to the video unit being coded with a first target coding mode, the set of luma samples and the set of additional luma samples are applied in the CCP model.
  16. The method of claim 15, wherein the first target coding mode is an extended CCCM (extCCCM) mode.
  17. The method of claim 15, wherein the first target coding mode has variants in the same way as the CCCM mode.
  18. The method of claim 17, wherein the variants of the first target coding mode comprise at least one of: multiple-model extCCCM (MM-extCCCM), extCCCM-left (extCCCM-L), extCCCM-top (extCCCM-T), MM-extCCCM-L, or MM-extCCCM-T.
  19. The method of claim 15, wherein the first target coding mode and variants of the first target coding mode replace an existing mode.
  20. The method of claim 19, wherein the existing mode comprises CCCM and/or variants of the CCCM.
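
The predictor recited in claims 5 and 6 can be sketched in code. The following is a minimal, non-normative illustration of the 11-tap cross-component model: the chroma prediction is a weighted sum of the center luma sample, its four adjacent and four diagonal neighbours, a nonlinear term P = C^2, and a bias term B = 1 << (bitdepth - 1). The function name `ext_cccm_predict` and the argument layout are illustrative assumptions, not names from the disclosure; coefficient derivation (e.g., least-squares fitting over a reference area) is out of scope here.

```python
import numpy as np

def ext_cccm_predict(luma, x, y, coeffs, bitdepth=10):
    """Sketch of the extended CCP model of claims 5-6 (illustrative only):
    predChromaVal = c0*C + c1*N + c2*S + c3*E + c4*W + c5*P
                  + c6*NW + c7*NE + c8*SW + c9*SE + c10*B,
    where P = C^2 and B = 1 << (bitdepth - 1).
    `luma` is the (down-sampled) luma plane and (x, y) is the center
    position corresponding to the chroma sample to be predicted.
    """
    C  = int(luma[y, x])          # center
    N  = int(luma[y - 1, x])      # north
    S  = int(luma[y + 1, x])      # south
    E  = int(luma[y, x + 1])      # east
    W  = int(luma[y, x - 1])      # west
    NW = int(luma[y - 1, x - 1])  # diagonal neighbours (claim 4)
    NE = int(luma[y - 1, x + 1])
    SW = int(luma[y + 1, x - 1])
    SE = int(luma[y + 1, x + 1])
    P  = C * C                    # nonlinear term (claim 6)
    B  = 1 << (bitdepth - 1)      # bias term (claim 6)
    taps = [C, N, S, E, W, P, NW, NE, SW, SE, B]
    # weighted sum with model parameters c0..c10
    return sum(c * t for c, t in zip(coeffs, taps))
```

The 15-tap model of claim 8 extends the same pattern with four extra taps N2, S2, E2 and W2 taken at positions non-adjacent to the center; in a real codec the result would additionally be clipped to the valid sample range.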

Description

METHOD, APPARATUS, AND MEDIUM FOR VIDEO PROCESSING

FIELD

[0001] Embodiments of the present disclosure relate generally to video processing techniques, and more particularly, to extended cross-component prediction.

BACKGROUND

[0002] Nowadays, digital video capabilities are being applied in various aspects of people's lives. Multiple types of video compression technologies, such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the ITU-T H.265 high efficiency video coding (HEVC) standard, and the versatile video coding (VVC) standard, have been proposed for video encoding/decoding. However, the coding efficiency of video coding techniques is generally expected to be further improved.

SUMMARY

[0003] Embodiments of the present disclosure provide a solution for video processing.

[0004] In a first aspect, a method for video processing is proposed. The method comprises: applying, for a conversion between a video unit of a video and a bitstream of the video unit, at least one of the following in a cross-component prediction (CCP) model: a set of luma samples, a set of additional luma samples, a set of reconstructed chroma samples, or a set of predicted chroma samples; determining a prediction or reconstruction of the video unit by applying the CCP model to the video unit; and performing the conversion based on the prediction or reconstruction. In this way, coding efficiency and coding performance can be improved.

[0005] In a second aspect, an apparatus for video processing is proposed. The apparatus comprises a processor and a non-transitory memory with instructions thereon. The instructions, upon execution by the processor, cause the processor to perform a method in accordance with the first aspect of the present disclosure.

[0006] In a third aspect, a non-transitory computer-readable storage medium is proposed. The non-transitory computer-readable storage medium stores instructions that cause a processor to perform a method in accordance with the first aspect of the present disclosure.

[0007] In a fourth aspect, another non-transitory computer-readable recording medium is proposed. The non-transitory computer-readable recording medium stores a bitstream of a video which is generated by a method performed by an apparatus for video processing. The method comprises: applying at least one of the following in a cross-component prediction (CCP) model: a set of luma samples, a set of additional luma samples, a set of reconstructed chroma samples, or a set of predicted chroma samples; determining a prediction or reconstruction of a video unit of the video by applying the CCP model to the video unit; and generating the bitstream based on the prediction or reconstruction.

[0008] In a fifth aspect, a method for storing a bitstream of a video is proposed. The method comprises: applying at least one of the following in a cross-component prediction (CCP) model: a set of luma samples, a set of additional luma samples, a set of reconstructed chroma samples, or a set of predicted chroma samples; determining a prediction or reconstruction of a video unit of the video by applying the CCP model to the video unit; generating the bitstream based on the prediction or reconstruction; and storing the bitstream in a non-transitory computer-readable recording medium.

[0009] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS

[0010] Through the following detailed description with reference to the accompanying drawings, the above and other objectives, features, and advantages of example embodiments of the present disclosure will become more apparent. In the example embodiments of the present disclosure, the same reference numerals usually refer to the same components.

[0011] Fig. 1 illustrates a block diagram of an example video coding system, in accordance with some embodiments of the present disclosure;

[0012] Fig. 2 illustrates a block diagram of a first example video encoder, in accordance with some embodiments of the present disclosure;

[0013] Fig. 3 illustrates a block diagram of an example video decoder, in accordance with some embodiments of the present disclosure;

[0014] Fig. 4 illustrates nominal vertical and horizontal locations of 4:2:2 luma and chroma samples in a picture;

[0015] Fig. 5 illustrates an example encoder block diagram;

[0016] Fig. 6 illustrates 67 intra prediction modes;

[0017] Fig. 7 illustrates reference samples for wide-angular intra prediction;

[0018] Fig. 8 illustrates the problem of discontinuity in the case of directions beyond 45°;

[0019] Fig. 9 illustrates locations of the samples used for the derivation