EP-4740471-A1 - METHODS AND APPARATUS FOR INHERITING CROSS-COMPONENT MODELS FROM TEMPORAL AND HISTORY-BASED NEIGHBOURS FOR CHROMA INTER CODING

EP 4740471 A1

Abstract

A method and apparatus for coding colour pictures using coding tools including one or more cross component models related modes are disclosed. According to this method, one or more cross-component prediction candidates are determined based on one or more cross-component models inherited from one or more previously coded slices or pictures or from a current picture. A candidate list comprising said one or more cross-component prediction candidates is derived. The second-colour block is encoded or decoded by using the candidate list, wherein when a target cross-component prediction candidate is selected to code the second-colour block, prediction data for the second-colour block is generated by applying a corresponding cross-component model to the first-colour block.

Inventors

  • TSENG, HSIN-YI
  • CHIANG, MAN-SHU
  • TSAI, CHIA-MING
  • CHUANG, CHENG-YEN
  • HSU, CHIH-WEI
  • CHEN, YI-WEN

Assignees

  • MEDIATEK INC.

Dates

Publication Date
20260513
Application Date
20240705

Claims (20)

  1. A method of coding colour pictures using coding tools including one or more cross component models related modes, the method comprising: receiving input data associated with a current block comprising a first-colour block and a second-colour block, wherein the input data comprises pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and wherein the current block is coded in an inter mode or IBC (Intra Block Copy) mode; determining one or more cross-component prediction candidates based on one or more cross-component models inherited from one or more previously coded slices or pictures or from a current picture; deriving a candidate list comprising said one or more cross-component prediction candidates; and encoding or decoding the second-colour block by using the candidate list, wherein when a target cross-component prediction candidate is selected to code the second-colour block, prediction data for the second-colour block is generated by applying a corresponding cross-component model to the first-colour block.
  2. The method of Claim 1, wherein target cross-component models are inherited from one or more collocated blocks in said one or more previously coded slices or pictures, and said one or more collocated blocks are indicated by inter mode information.
  3. The method of Claim 2, wherein the collocated block is indicated by the inter mode information of the current block.
  4. The method of Claim 2, wherein if the current block is coded in a subblock motion mode, one or more subblock temporal candidates corresponding to one or more subblock temporal cross-component models inherited from one or more collocated blocks indicated by the inter mode information of one or more subblocks of the current block are added to the candidate list.
  5. The method of Claim 2, wherein the collocated block is referred to by the inter mode information of one or more neighbouring blocks of the current block.
  6. The method of Claim 1, wherein said one or more cross-component prediction candidates are located at one or more pre-defined positions in said one or more previously coded slices or pictures according to current location of the current block, current block width, current block height, or a combination thereof.
  7. The method of Claim 6, wherein said one or more pre-defined positions are inside a corresponding area of the current block or said one or more pre-defined positions are outside the corresponding area of the current block.
  8. The method of Claim 6, wherein a first set of values and a second set of values are determined, and said one or more pre-defined positions comprise one or more offset locations from the current location of the current block, and wherein said one or more offset locations comprise the first set of values scaled by the current block width for a horizontal direction, the second set of values scaled by the current block height for a vertical direction, or both.
  9. The method of Claim 1, wherein a collocated picture is determined, and wherein the collocated picture corresponds to a target previously coded picture that a target cross-component model is inherited from.
  10. The method of Claim 9, wherein the collocated picture corresponds to one of reference pictures in one or more reference lists.
  11. The method of Claim 9, wherein the collocated picture is selected according to a reference index and a target reference list signalled in or parsed from a picture header or a slice header.
  12. The method of Claim 9, wherein the collocated picture is selected as a target reference picture in one or more reference lists, and POC (Picture Order Count) difference or QP (Quantization Parameter) difference between the target reference picture and a current picture is the smallest.
  13. The method of Claim 9, wherein the collocated picture corresponds to a most recently coded I-picture.
  14. The method of Claim 9, wherein both the collocated picture and positions of said one or more cross-component prediction candidates or only the positions of said one or more cross-component prediction candidates are determined according to a motion vector of a neighbouring block or the current block.
  15. The method of Claim 14, wherein when the collocated picture corresponds to a target reference picture associated with the motion vector of the neighbouring block or the current block, the positions of said one or more cross-component prediction candidates are determined according to the motion vector of the neighbouring block or the current block shifted by a set of pre-defined values.
  16. The method of Claim 14, wherein the positions of said one or more cross-component prediction candidates are determined according to a scaled motion vector shifted by a set of pre-defined values, and wherein the scaled motion vector is derived based on the motion vector of the neighbouring block scaled by a ratio of a first POC (Picture Order Count) distance for a current reference picture and a second POC distance for the collocated picture.
  17. The method of Claim 14, wherein the neighbouring block is selected from a pre-defined position.
  18. The method of Claim 17, wherein if the neighbouring block at the pre-defined position is not an inter block, the neighbouring block is not used to derive said one or more cross-component prediction candidates.
  19. The method of Claim 14, wherein the neighbouring block is selected from a set of pre-defined positions according to a pre-defined checking order.
  20. The method of Claim 19, wherein a first neighbouring block, according to the pre-defined checking order, having a corresponding reference picture being the collocated picture is selected as the neighbouring block.
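The claims above concern inheriting an already-derived cross-component model and applying it to the first-colour (e.g. luma) block to generate prediction data for the second-colour (e.g. chroma) block. As a rough illustration of what such a linear model does, here is a minimal CCLM-style sketch in Python; the function names and the least-squares derivation are generic illustrations, not the patent's specific inheritance method:

```python
def derive_ccm(luma_ref, chroma_ref):
    """Fit a linear cross-component model chroma ~= a * luma + b by
    least squares over reference samples. In the claimed scheme the
    model parameters (a, b) would instead be inherited from temporal
    or history-based neighbours; this derivation is only illustrative."""
    n = len(luma_ref)
    sx, sy = sum(luma_ref), sum(chroma_ref)
    sxx = sum(x * x for x in luma_ref)
    sxy = sum(x * y for x, y in zip(luma_ref, chroma_ref))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def apply_ccm(luma_block, a, b, bit_depth=10):
    """Generate second-colour prediction data by applying model (a, b)
    to reconstructed first-colour samples, clipped to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [min(max(round(a * x + b), 0), max_val) for x in luma_block]
```

For example, reference samples related by chroma = 0.5 * luma + 10 yield a = 0.5, b = 10, so a luma sample of 100 predicts a chroma sample of 60.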

Description

METHODS AND APPARATUS FOR INHERITING CROSS-COMPONENT MODELS FROM TEMPORAL AND HISTORY-BASED NEIGHBOURS FOR CHROMA INTER CODING

CROSS REFERENCE TO RELATED APPLICATIONS

The present invention is a non-Provisional Application of and claims priority to U.S. Provisional Patent Application No. 63/511,922, filed on July 5, 2023. The U.S. Provisional Patent Application is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention relates to video coding systems. In particular, the present invention relates to cross-component prediction for a chroma component by inheriting temporal and/or history-based cross-component models.

BACKGROUND AND RELATED ART

Versatile Video Coding (VVC) is the latest international video coding standard developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as an ISO standard: ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, published in February 2021. VVC is developed based on its predecessor HEVC (High Efficiency Video Coding) by adding more coding tools to improve coding efficiency and also to handle various types of video sources, including 3-dimensional (3D) video signals.

Fig. 1A illustrates an exemplary adaptive Inter/Intra video encoding system incorporating loop processing. For Intra Prediction 110, the prediction data is derived based on previously coded video data in the current picture. For Inter Prediction 112, Motion Estimation (ME) is performed at the encoder side and Motion Compensation (MC) is performed based on the result of ME to provide prediction data derived from other picture(s) and motion data. Switch 114 selects Intra Prediction 110 or Inter Prediction 112, and the selected prediction data is supplied to Adder 116 to form prediction errors, also called residues.
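The prediction selection and residue formation described above can be sketched as follows. This is a toy 1-D illustration of the Switch 114 / Adder 116 data flow, not actual VVC code; a real encoder operates on 2-D blocks and makes the Intra/Inter decision by rate-distortion optimization:

```python
def form_residues(original, intra_pred, inter_pred, use_inter):
    """Switch 114 picks Intra Prediction (110) or Inter Prediction (112);
    Adder 116 subtracts the chosen prediction from the input samples to
    form the prediction errors (residues) passed on to Transform 118."""
    prediction = inter_pred if use_inter else intra_pred
    residues = [o - p for o, p in zip(original, prediction)]
    return prediction, residues
```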
The prediction error is then processed by Transform (T) 118 followed by Quantization (Q) 120. The transformed and quantized residues are then coded by Entropy Encoder 122 to be included in a video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information such as motion and coding modes associated with Intra prediction and Inter prediction, and other information such as parameters associated with loop filters applied to the underlying image area. The side information associated with Intra Prediction 110, Inter Prediction 112 and in-loop filter 130 is provided to Entropy Encoder 122 as shown in Fig. 1A. When an Inter-prediction mode is used, a reference picture or pictures have to be reconstructed at the encoder end as well. Consequently, the transformed and quantized residues are processed by Inverse Quantization (IQ) 124 and Inverse Transformation (IT) 126 to recover the residues. The residues are then added back to prediction data 136 at Reconstruction (REC) 128 to reconstruct video data. The reconstructed video data may be stored in Reference Picture Buffer 134 and used for prediction of other frames. As shown in Fig. 1A, incoming video data undergoes a series of processing steps in the encoding system. The reconstructed video data from REC 128 may be subject to various impairments due to this series of processing steps. Accordingly, in-loop filter 130 is often applied to the reconstructed video data before the reconstructed video data are stored in Reference Picture Buffer 134 in order to improve video quality. For example, deblocking filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) may be used. The loop filter information may need to be incorporated in the bitstream so that a decoder can properly recover the required information. Therefore, loop filter information is also provided to Entropy Encoder 122 for incorporation into the bitstream. In Fig. 1A, Loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in Reference Picture Buffer 134.

The system in Fig. 1A is intended to illustrate an exemplary structure of a typical video encoder. It may correspond to the High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264 or VVC. The decoder, as shown in Fig. 1B, can use similar or a portion of the same functional blocks as the encoder, except for Transform 118 and Quantization 120, since the decoder only needs Inverse Quantization 124 and Inverse Transform 126. Instead of Entropy Encoder 122, the decoder uses an Entropy Decoder 140 to decode the video bitstream into quantized transform coefficients and needed coding information (e.g. ILPF information, Intra prediction information and Inter prediction information). The Intra Prediction 150 at the decoder side does not need to perform the mode search. Instead, the decoder only needs to generate Intra prediction according to Intra prediction information received from the Entropy Decoder 140.
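The quantize/dequantize/reconstruct loop shared by the encoder of Fig. 1A and the decoder of Fig. 1B can be sketched as below. The transform stage is omitted for brevity, and the uniform scalar quantizer with a single `qstep` is a simplifying assumption, not VVC's actual quantization:

```python
def reconstruct(residues, prediction, qstep=8, bit_depth=8):
    """REC 128: quantize the residues (Q 120), dequantize them (IQ 124),
    then add them back to the prediction data and clip to the valid
    sample range. The encoder runs this same loop so that its reference
    pictures match what the decoder will reconstruct."""
    levels = [round(r / qstep) for r in residues]   # Q 120 (scalar quantizer)
    recon_res = [lv * qstep for lv in levels]       # IQ 124
    max_val = (1 << bit_depth) - 1
    return [min(max(p + r, 0), max_val) for p, r in zip(prediction, recon_res)]
```

Because quantization is lossy, the reconstructed samples differ slightly from the originals; in-loop filter 130 is applied to this output before it enters Reference Picture Buffer 134.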