EP-4740473-A1 - METHOD AND APPARATUS FOR CONSTRUCTING CANDIDATE LIST FOR INHERITING NEIGHBORING CROSS-COMPONENT MODELS FOR CHROMA INTER CODING
Abstract
Methods and apparatus for video decoding are disclosed. According to this method, input data associated with a current block of a current image of a video is received, wherein the current block is coded in a non-intra mode. A candidate list corresponding to the current block is constructed, wherein the candidate list comprises cross-component models, and the cross-component models comprise at least one self-derived cross-component model or at least one temporal candidate generated according to at least one motion vector of the current block. A model is selected from the candidate list. The current block is then reconstructed based on the selected model.
Inventors
- TSENG, HSIN-YI
- CHIANG, MAN-SHU
- TSAI, CHIA-MING
- CHUANG, CHENG-YEN
- HSU, CHIH-WEI
- CHEN, YI-WEN
Assignees
- MediaTek Inc.
Dates
- Publication Date
- 2026-05-13
- Application Date
- 2024-07-05
Claims (17)
- A video decoding method, comprising: receiving input data associated with a current block of a current image of a video, wherein the current block is coded in a non-intra mode; constructing a candidate list corresponding to the current block, wherein the candidate list comprises cross-component models, and the cross-component models comprise at least one self-derived cross-component model or at least one candidate generated according to motion information of the current block; selecting one or more selected models from the candidate list; and reconstructing the current block based on the one or more selected models.
- The video decoding method of claim 1, wherein the motion information of the current block is a motion vector or a block vector of the current block.
- The video decoding method of claim 1, further comprising: generating a chroma prediction of the current block from luma information of the current block based on the one or more selected models to reconstruct the current block.
- The video decoding method of claim 1, wherein the cross-component models further comprise inherited cross-component models.
- The video decoding method of claim 4, wherein the inherited cross-component models comprise at least one of spatial model, temporal model, history-based model, pairwise average model and default model.
- The video decoding method of claim 1, wherein the at least one self-derived cross-component model is CCRM.
- The video decoding method of claim 1, wherein the self-derived cross-component model is derived through a weight derivation process, wherein the weight derivation process comprises calculating a relationship weight between a target chroma prediction and at least one of one or more source terms from a luma component, one or more source terms from chroma components, and one or more bias terms.
- The video decoding method of claim 1, wherein the method further comprises a candidate list modification process.
- The video decoding method of claim 8, wherein the candidate list modification process comprises a reordering process, wherein the reordering process comprises a reordering rule for reordering the cross-component models in the candidate list.
- The video decoding method of claim 9, wherein the reordering rule is based on a model error calculated by computing a difference between a prediction, generated by applying each of the cross-component models to a neighboring template of the current block, and a reconstruction of the neighboring template.
- The video decoding method of claim 10, wherein the difference is calculated using a Sum of Absolute Differences (SAD).
- The video decoding method of claim 10, wherein the cross-component model with the smallest model error is selected to reconstruct the current block.
- The video decoding method of claim 8, wherein the candidate list modification process comprises a pruning process, wherein the pruning process comprises determining whether to include a new cross-component model into the candidate list by calculating a similarity between the new cross-component model and the cross-component models in the candidate list or by calculating a similarity between the new cross-component model and another self-derived model.
- The video decoding method of claim 13, wherein the pruning process calculates the similarity based on the difference between the model parameters of the two models.
- The video decoding method of claim 14, wherein, if the similarity is smaller than or equal to a threshold, the new cross-component model is not included in the candidate list.
- A video encoding method, comprising: receiving input data associated with a current block of a current image of a video, wherein the current block is coded in a non-intra mode; constructing a candidate list corresponding to the current block, wherein the candidate list comprises cross-component models, and the cross-component models comprise at least one self-derived cross-component model or at least one temporal candidate generated according to at least one motion vector of the current block; selecting one or more selected models from the candidate list; and encoding chroma information of the current block from luma information of the current block based on the one or more selected models.
- A video decoding apparatus, comprising: a processor, which is configured to: receive input data associated with a current block of a current image of a video, wherein the current block is coded in a non-intra mode; construct a candidate list corresponding to the current block, wherein the candidate list comprises cross-component models, and the cross-component models comprise at least one self-derived cross-component model or at least one temporal candidate generated according to at least one motion vector of the current block; select one or more selected models from the candidate list; and reconstruct the current block based on the one or more selected models.
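The candidate-list handling recited in the claims above (template-based reordering by SAD model error, and pruning by model-parameter similarity) can be sketched as follows. This is a minimal illustrative sketch, not the patented method: the simple linear model form `pred_C = a * rec_L + b`, the parameter-difference similarity measure, and all function names and the threshold value are assumptions introduced for illustration.

```python
def apply_model(model, luma_samples):
    """Illustrative linear cross-component model: chroma predicted from
    co-located reconstructed luma as a * luma + b."""
    a, b = model
    return [a * l + b for l in luma_samples]

def template_sad(model, template_luma, template_chroma_rec):
    """Model error (claims 10-11): SAD between the chroma prediction
    obtained by applying the model to the neighboring template's luma
    and the template's reconstructed chroma."""
    pred = apply_model(model, template_luma)
    return sum(abs(p - r) for p, r in zip(pred, template_chroma_rec))

def is_redundant(new_model, existing_models, threshold):
    """Pruning (claims 13-15): similarity taken here as the absolute
    difference between model parameters; if it is at or below the
    threshold, the new model is not added."""
    a, b = new_model
    for ea, eb in existing_models:
        if abs(a - ea) + abs(b - eb) <= threshold:
            return True
    return False

def build_candidate_list(candidate_models, template_luma,
                         template_chroma_rec, prune_threshold=0.05):
    """Construct the list with pruning, then reorder by ascending
    template SAD so the best-fitting model comes first (claim 12)."""
    cand_list = []
    for model in candidate_models:
        if not is_redundant(model, cand_list, prune_threshold):
            cand_list.append(model)
    cand_list.sort(key=lambda m: template_sad(m, template_luma,
                                              template_chroma_rec))
    return cand_list
```

Under this sketch, a decoder would reconstruct the block's chroma by applying the first (or signalled) model in the reordered list to the block's reconstructed luma samples.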
Description
METHOD AND APPARATUS FOR CONSTRUCTING CANDIDATE LIST FOR INHERITING NEIGHBORING CROSS-COMPONENT MODELS FOR CHROMA INTER CODING

FIELD OF THE INVENTION

The present invention relates to video coding systems. In particular, the present invention relates to constructing a candidate list for inheriting neighboring cross-component models for chroma inter coding.

BACKGROUND

Versatile Video Coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as an ISO standard: ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, published Feb. 2021. VVC was developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources, including 3-dimensional (3D) video signals.

Fig. 1A illustrates an exemplary adaptive Inter/Intra video encoding system incorporating loop processing. For Intra Prediction 110, the prediction data is derived based on previously encoded video data in the current picture. For Inter Prediction 112, Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture(s) and motion data. Switch 114 selects Intra Prediction 110 or Inter Prediction 112, and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues. The prediction errors are then processed by Transform (T) 118 followed by Quantization (Q) 120. The transformed and quantized residues are then encoded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data.
The bitstream associated with the transform coefficients is then packed with side information, such as the motion and coding modes associated with Intra prediction and Inter prediction, and other information such as parameters associated with the loop filters applied to the underlying image area. The side information associated with Intra Prediction 110, Inter Prediction 112 and In-loop Filter 130 is provided to Entropy Encoder 122, as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well. Consequently, the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues. The residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct the video data. The reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames. As shown in Fig. 1A, incoming video data undergoes a series of processing steps in the encoding system. The reconstructed video data from REC 128 may be subject to various impairments due to this series of processing steps. Accordingly, In-loop Filter 130 is often applied to the reconstructed video data before it is stored in Reference Picture Buffer 134, in order to improve video quality. For example, a de-blocking filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) may be used. The loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream. In Fig. 1A, In-loop Filter 130 is applied to the reconstructed video before the reconstructed samples are stored in Reference Picture Buffer 134. The system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder.
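The reconstruction path described above (residues formed at Adder 116, quantized, dequantized at IQ/IT, and added back to the prediction at REC 128) can be sketched numerically. This is a simplified illustration: a flat scalar quantizer stands in for the actual Transform/Quantization stages of Fig. 1A, and the step size is an arbitrary assumption.

```python
QSTEP = 4  # illustrative quantization step size, not a VVC value

def quantize(residues, qstep=QSTEP):
    """Stand-in for T 118 / Q 120: scalar quantization of residues."""
    return [round(r / qstep) for r in residues]

def dequantize(levels, qstep=QSTEP):
    """Stand-in for IQ 124 / IT 126: recover approximate residues."""
    return [l * qstep for l in levels]

def reconstruct_block(source, prediction):
    """Mirror the encoder-side reconstruction loop of Fig. 1A."""
    residues = [s - p for s, p in zip(source, prediction)]  # Adder 116
    levels = quantize(residues)                             # T/Q
    recon_residues = dequantize(levels)                     # IQ/IT
    # REC 128: residues added back to the prediction data
    return [p + r for p, r in zip(prediction, recon_residues)]
```

The reconstructed samples differ from the source by at most the quantization error, which is why the same reconstruction (rather than the original source) must be stored in the reference picture buffer at both encoder and decoder.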
It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264, VVC or any other video coding standard. The decoder, as shown in Fig. 1B, can use functional blocks similar to, or a portion of the same functional blocks as, the encoder, except for Transform 118 and Quantization 120, since the decoder only needs Inverse Quantization 124 and Inverse Transform 126. Instead of Entropy Encoder 122, the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g. ILPF information, Intra prediction information and Inter prediction information). The Intra Prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to the Intra prediction information received from the Entropy Decoder 140. Furthermore, for Inter prediction, the decoder only needs to perform motion compensation (MC 152) according to the Inter prediction information received from the Entropy Decoder 140, without the need for motion estimation. According to VVC, an input picture is partitioned into non-