CN-122029812-A - Method and apparatus for syntax design of cross-component model in video coding and decoding system
Abstract
A method and apparatus for video encoding and decoding are disclosed. According to the method, for a current second color block coded in a cross-component linear model related mode, a most probable linear model mode list is generated, the list comprising one or more most probable linear model candidates selected from a candidate list. Mode selection information associated with the most probable linear model mode list is signaled or parsed. The current second color block is then encoded or decoded using a target candidate selected from the most probable linear model mode list according to the mode selection information, wherein prediction data of the current second color block is generated by applying a current linear model related to the target candidate to the current first color block.
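The abstract describes generating chroma prediction data by applying a linear model to the co-located first color (luma) block. A minimal sketch of such a cross-component linear prediction is shown below; it is not the patented method, and the derivation (a simple least-squares fit over neighboring reconstructed samples) and all names are illustrative assumptions.

```python
# Illustrative sketch (not the patented method): derive a cross-component
# linear model pred_C = alpha * rec_L + beta from reconstructed neighboring
# luma/chroma samples, then apply it to the co-located luma block.

def derive_linear_model(neigh_luma, neigh_chroma):
    """Least-squares fit of chroma = alpha * luma + beta over neighbor samples."""
    n = len(neigh_luma)
    sum_l = sum(neigh_luma)
    sum_c = sum(neigh_chroma)
    sum_ll = sum(l * l for l in neigh_luma)
    sum_lc = sum(l * c for l, c in zip(neigh_luma, neigh_chroma))
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:                       # flat luma neighborhood: DC fallback
        return 0.0, sum_c / n
    alpha = (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta

def predict_chroma(rec_luma_block, alpha, beta, bit_depth=8):
    """Apply the linear model to the (downsampled) co-located luma block,
    clipping to the valid sample range."""
    max_val = (1 << bit_depth) - 1
    return [[min(max_val, max(0, round(alpha * l + beta)))
             for l in row] for row in rec_luma_block]
```

Standards such as VVC use integer arithmetic and specific neighbor selection rules instead of floating-point least squares; this sketch only conveys the model form.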
Inventors
- CAI JIAMING
- ZHUANG ZHENGYAN
- ZENG XINYI
- XU ZHIWEI
- CHEN QIWEN
- CHEN QINGYE
- ZHUANG ZIDE
Assignees
- MediaTek Inc. (联发科技股份有限公司)
Dates
- Publication Date
- 20260512
- Application Date
- 20241012
- Priority Date
- 20231013
Claims (20)
- 1. A video encoding and decoding method, comprising: receiving input data associated with a current block, the current block comprising a current first color block and a current second color block, wherein the input data comprises pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and wherein the current second color block is coded in a cross-component linear model related mode; generating a most probable linear model mode list comprising one or more most probable linear model candidates selected from a candidate list; signaling or parsing mode selection information associated with the most probable linear model mode list; and encoding or decoding the current second color block using a target candidate selected from the most probable linear model mode list according to the mode selection information, wherein prediction data of the current second color block is generated by applying a current linear model related to the target candidate to the current first color block.
- 2. The method of claim 1, wherein a first flag is signaled or parsed to indicate whether a current linear model mode is a candidate in the list of most probable linear model modes.
- 3. The method of claim 2, wherein when the first flag is true, further signaling a candidate index of the target candidate in the most probable linear model mode list.
- 4. The method of claim 2, wherein when the first flag is false, further indicating mode information related to the current linear model mode in an implicit or explicit manner.
- 5. The method of claim 4, wherein the mode information comprises a linear model mode type, a selection of single or multiple models, a model derivation mode, or a combination thereof.
- 6. The method of claim 2, wherein at least one linear model mode candidate in the most probable linear model mode list is set to invalid or not allowed to be indicated when the most probable linear model mode list does not include all allowed linear model modes of the current block and the first flag is false.
- 7. The method of claim 1, wherein the most probable linear model mode list comprises one or more linear model modes of one or more neighboring coded blocks.
- 8. The method of claim 7, wherein the one or more neighboring coded blocks are located at one or more neighboring locations, one or more non-neighboring locations, or both.
- 9. The method of claim 7, wherein the one or more linear model modes of the one or more neighboring coded blocks are sorted in a non-decreasing order after being counted.
- 10. The method of claim 1, wherein the most probable linear model mode list comprises candidates from one or more neighboring locations, one or more non-neighboring locations, one or more temporal locations, one or more historical locations, or a combination thereof, located in a same CTU row or picture.
- 11. The method of claim 1, wherein the most probable linear model mode list comprises one or more combined linear model mode candidates, and each of the combined linear model mode candidates is generated by combining a plurality of allowed linear model modes of the current block.
- 12. The method of claim 11, wherein the plurality of allowed linear model modes of the current block correspond to different model derivation modes, different selections between single models and multiple models, different linear model mode types, or a combination thereof.
- 13. The method of claim 11, wherein the most probable linear model mode list further comprises one or more linear models from one or more previously coded blocks in the current picture or from one or more previously coded pictures.
- 14. The method of claim 1, wherein the candidate list is reordered according to a plurality of candidate indices prior to candidate reordering.
- 15. The method of claim 14, wherein when a maximum allowable size of the candidate list is N, the first m linear model candidates in the candidate list are not reordered and the remaining (N − m) candidates are reordered, where N and m are integers and 1 ≤ m < N.
- 16. The method of claim 14, wherein a second flag is used to indicate whether the candidate reordering is applied, and a third flag is used to indicate whether the candidate list is in an inverted order.
- 17. The method of claim 14, wherein the candidate list is partitioned or classified into k groups, and the candidate reordering is applied to each of the k groups separately, where k is a positive integer and k > 1.
- 18. The method of claim 17, wherein the candidate list is partitioned or classified according to whether a model type is single or multi-model, a spatial candidate type, a temporal candidate type, a historical candidate type, a default candidate type, a spatial geometric distance, a temporal image order count distance between a corresponding candidate location and the current block, or a combination thereof.
- 19. The method of claim 17, wherein after the candidate reordering is applied to each of the k groups, the candidate list is further reordered at a group level to reorder positions of the k groups, and wherein the group-level reordering is performed according to a minimum cost, an average cost, a median cost, or a maximum cost associated with the candidates in each of the k groups.
- 20. An apparatus for video encoding, the apparatus comprising one or more electronic circuits or processors configured to: receive input data associated with a current block, the current block comprising a current first color block and a current second color block, wherein the input data comprises pixel data to be encoded at an encoder side or data associated with the current block to be decoded at a decoder side, and wherein the current second color block is coded in a cross-component linear model related mode; generate a most probable linear model mode list comprising one or more most probable linear model candidates selected from a candidate list; signal or parse mode selection information associated with the most probable linear model mode list; and encode or decode the current second color block using a target candidate selected from the most probable linear model mode list according to the mode selection information, wherein prediction data of the current second color block is generated by applying a current linear model related to the target candidate to the current first color block.
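Claims 1-3 and 14-15 together describe building a most probable linear model mode list, partially reordering it, and signaling a flag plus an index. The sketch below is a hypothetical reading of that scheme; the mode names, cost function, and returned syntax-element names are invented for illustration and are not taken from the patent.

```python
# Hypothetical sketch of the claimed signaling (names invented): build an
# MPM list of linear-model modes from neighboring blocks, keep the first m
# candidates fixed while reordering the rest by a cost (claim 15), and
# signal a flag plus an index when the current mode is in the list
# (claims 2-3).

def build_lm_mpm_list(neighbor_modes, default_modes, max_size):
    """Collect LM modes of neighboring coded blocks, pad with defaults,
    and drop duplicates, up to max_size entries."""
    mpm = []
    for mode in neighbor_modes + default_modes:
        if mode not in mpm:
            mpm.append(mode)
        if len(mpm) == max_size:
            break
    return mpm

def partially_reorder(mpm, cost_of, m):
    """Per claim 15: the first m candidates stay fixed; the remaining
    (N - m) candidates are sorted by ascending cost."""
    head, tail = mpm[:m], mpm[m:]
    return head + sorted(tail, key=cost_of)

def encode_mode_selection(current_mode, mpm):
    """Per claims 2-3: a flag indicates whether the current mode is an
    MPM candidate; if true, an index into the list follows, otherwise
    the mode information is indicated explicitly."""
    if current_mode in mpm:
        return {"mpm_flag": True, "mpm_index": mpm.index(current_mode)}
    return {"mpm_flag": False, "mode_info": current_mode}
```

In a real codec the cost would typically be a template-matching cost computed identically at encoder and decoder, so the reordering needs no extra signaling.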
Description
Method and apparatus for syntax design of cross-component model in video coding and decoding system
Cross Reference
The present invention is a non-provisional application of, and claims priority to, U.S. Provisional Patent Application No. 63/590,004, filed on October 13, 2023. The U.S. provisional patent application is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to video coding and decoding. In particular, the present invention relates to a scheme for improving the coding performance of a video codec system employing a Linear Model (LM) mode.
Background
Versatile Video Coding (VVC) is the latest international video coding standard, developed by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). The standard has been published as ISO/IEC 23090-3:2021, Information technology - Coded representation of immersive media - Part 3: Versatile video coding, in February 2021. VVC was developed on the basis of its predecessor, High Efficiency Video Coding (HEVC), improving coding efficiency by adding more coding tools, and it is capable of handling various types of video sources, including three-dimensional (3D) video signals. Fig. 1A illustrates an example adaptive inter/intra video coding system that includes loop processing. For intra prediction, prediction data is derived based on previously coded video data in the current picture.
For inter prediction 112, motion estimation (Motion Estimation; ME) is performed at the encoder side, and motion compensation (Motion Compensation; MC) is performed based on the results of ME to provide prediction data derived from other pictures together with motion data. The switch 114 selects either intra prediction 110 or inter prediction 112, and the selected prediction data is provided to adder 116 to form a prediction error, also referred to as a residual. The prediction error is then processed by Transform (T) 118, followed by Quantization (Q) 120. The transformed and quantized residual is then encoded by entropy encoder 122 for inclusion in a video bitstream corresponding to the compressed video data. The bitstream associated with the transform coefficients is then packed with side information, such as motion and coding modes associated with intra and inter prediction, and other information related to loop filters applied to the underlying picture region. Side information related to intra prediction 110, inter prediction 112, and loop filter 130 is provided to entropy encoder 122, as shown in fig. 1A. When inter prediction modes are employed, the reference picture must also be reconstructed at the encoder side. Thus, the transformed and quantized residual is processed by inverse quantization (Inverse Quantization; IQ) 124 and inverse transform (Inverse Transform; IT) 126 to recover the residual. The residual is then added back to the prediction data 136 at Reconstruction (REC) 128 to reconstruct the video data. The reconstructed video data may be stored in a reference picture buffer 134 and used for prediction of other frames. As shown in fig. 1A, input video data undergoes a series of processes in an encoding system. The reconstructed video data from the REC 128 may be subject to various impairments due to this series of processing steps.
Therefore, loop filter 130 is typically applied to the reconstructed video data to improve video quality before the data is stored in reference picture buffer 134. For example, a deblocking filter (Deblocking Filter; DF), a sample adaptive offset (Sample Adaptive Offset; SAO), and an adaptive loop filter (Adaptive Loop Filter; ALF) may be used. Loop filter information may need to be included in the bitstream so that the decoder can correctly recover the required information. Thus, loop filter information is also provided to the entropy encoder 122 for incorporation into the bitstream. In fig. 1A, loop filter 130 is applied to the reconstructed video before the reconstructed samples are stored in reference picture buffer 134. The system in fig. 1A is intended to show an example structure of a typical video encoder. It may correspond to a High Efficiency Video Coding (HEVC) system, VP8, VP9, H.264, or VVC. As shown in fig. 1B, the decoder may use the same or partially the same functional blocks as the encoder, except for the transform 118 and quantization 120, as the decoder only needs to perform inverse quantization 124 and inverse transform 126. The decoder uses the entropy decoder 140 in place of the entropy encoder 122 to decode the video bitstream into quantized transform coefficients and the required coding information (e.g., ILPF information, intra-pre