US-12627835-B2 - Methods and devices for multi-hypothesis-based prediction


Abstract

Methods, apparatuses, and non-transitory computer-readable storage mediums are provided for video coding. In one method, a decoder obtains, for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, and the LM mode includes a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode. The decoder then combines the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block.

Inventors

HONG-JHENG JHU
Xiaoyu Xiu
Yi-Wen Chen
Wei Chen
Che-Wei KUO
Ning Yan
Han Gao
Xianglin Wang
Bing Yu

Assignees

Beijing Dajia Internet Information Technology Co., Ltd.

Dates

Publication Date: 2026-05-12
Application Date: 2024-08-09

Claims (18)

1. A method for video decoding, comprising: obtaining, by a decoder, for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, wherein the LM mode comprises a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode; and combining, by the decoder, the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block, wherein the combining, by the decoder, the non-LM mode and the LM mode comprises: obtaining, by the decoder, a first predicted chroma sample by applying a template-based intra mode derivation (TIMD) mode; obtaining, by the decoder, a second prediction chroma sample by applying the CCLM mode or the MMLM mode; and obtaining, by the decoder, a final predictor for the current chroma block by calculating a weighted average of the first predicted chroma sample and the second predicted chroma sample.
2. The method for video decoding of claim 1, wherein obtaining the non-LM mode comprises: obtaining a template-based intra mode derivation (TIMD) chroma mode as the non-LM mode for the current chroma block, wherein the TIMD chroma mode is derived using a template comprising neighboring samples of the current chroma block.
3. The method for video decoding of claim 1, wherein obtaining the non-LM mode comprises: obtaining a cross-component TIMD mode as the non-LM mode for the current chroma block, wherein the cross-component TIMD mode is derived based on collocated reconstructed luma samples.
4. The method for video decoding of claim 3, further comprising: calculating, by the decoder, a sum of absolute transformed differences (SATD) between prediction samples and reconstruction samples of a template; and obtaining, by the decoder, an intra prediction mode with a minimum SATD as the cross-component TIMD mode for performing chroma intra prediction of the current chroma block.
5. The method for video decoding of claim 4, further comprising: in response to determining, by the decoder, that the intra prediction mode obtained from the cross-component TIMD mode is same as an intra prediction mode derived from a derived mode (DM), obtaining, by the decoder, an intra prediction mode with a second minimum SATD as the cross-component TIMD mode for performing chroma intra prediction of the current chroma block.
6. The method for video decoding of claim 1, wherein obtaining the non-LM mode comprises: obtaining a derived mode (DM) as the non-LM mode for the current chroma block, wherein the DM is derived based on collocated reconstructed luma samples.
7. An apparatus, comprising: one or more processors; and a memory configured to store instructions executable by the one or more processors; wherein the one or more processors, upon execution of the instructions, are configured to perform acts comprising: obtaining for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, wherein the LM mode comprises a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode; and combining the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block, wherein the combining the non-LM mode and the LM mode comprises: obtaining a first predicted chroma sample by applying a template-based intra mode derivation (TIMD) mode; obtaining a second prediction chroma sample by applying the CCLM mode or the MMLM mode; and obtaining a final predictor for the current chroma block by calculating a weighted average of the first predicted chroma sample and the second predicted chroma sample.
8. The apparatus of claim 7, wherein obtaining the non-LM mode comprises: obtaining a template-based intra mode derivation (TIMD) chroma mode as the non-LM mode for the current chroma block, wherein the TIMD chroma mode is derived using a template comprising neighboring samples of the current chroma block.
9. The apparatus of claim 7, wherein obtaining the non-LM mode comprises: obtaining a cross-component TIMD mode as the non-LM mode for the current chroma block, wherein the cross-component TIMD mode is derived based on collocated reconstructed luma samples.
10. The apparatus of claim 9, wherein the one or more processors, upon execution of the instructions, are configured to perform acts further comprising: calculating a sum of absolute transformed differences (SATD) between prediction samples and reconstruction samples of a template; and obtaining an intra prediction mode with a minimum SATD as the cross-component TIMD mode for performing chroma intra prediction of the current chroma block.
11. The apparatus of claim 10, wherein the one or more processors, upon execution of the instructions, are configured to perform acts further comprising: in response to determining that the intra prediction mode obtained from the cross-component TIMD mode is same as an intra prediction mode derived from a derived mode (DM), obtaining an intra prediction mode with a second minimum SATD as the cross-component TIMD mode for performing chroma intra prediction of the current chroma block.
12. The apparatus of claim 7, wherein obtaining the non-LM mode comprises: obtaining a derived mode (DM) as the non-LM mode for the current chroma block, wherein the DM is derived based on collocated reconstructed luma samples.
13. A non-transitory computer-readable storage medium, storing thereon a bitstream to be decoded by acts comprising: obtaining for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, wherein the LM mode comprises a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode; and combining the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block, wherein the combining the non-LM mode and the LM mode comprises: obtaining a first predicted chroma sample by applying a template-based intra mode derivation (TIMD) mode; obtaining a second prediction chroma sample by applying the CCLM mode or the MMLM mode; and obtaining a final predictor for the current chroma block by calculating a weighted average of the first predicted chroma sample and the second predicted chroma sample.
14. The non-transitory computer-readable storage medium of claim 13, wherein obtaining the non-LM mode comprises: obtaining a template-based intra mode derivation (TIMD) chroma mode as the non-LM mode for the current chroma block, wherein the TIMD chroma mode is derived using a template comprising neighboring samples of the current chroma block.
15. The non-transitory computer-readable storage medium of claim 13, wherein obtaining the non-LM mode comprises: obtaining a cross-component TIMD mode as the non-LM mode for the current chroma block, wherein the cross-component TIMD mode is derived based on collocated reconstructed luma samples.
16. The non-transitory computer-readable storage medium of claim 15, further comprising: calculating a sum of absolute transformed differences (SATD) between prediction samples and reconstruction samples of a template; and obtaining an intra prediction mode with a minimum SATD as the cross-component TIMD mode for performing chroma intra prediction of the current chroma block.
17. The non-transitory computer-readable storage medium of claim 16, further comprising: in response to determining that the intra prediction mode obtained from the cross-component TIMD mode is same as an intra prediction mode derived from a derived mode (DM), obtaining an intra prediction mode with a second minimum SATD as the cross-component TIMD mode for performing chroma intra prediction of the current chroma block.
18. The non-transitory computer-readable storage medium of claim 13, wherein obtaining the non-LM mode comprises: obtaining a derived mode (DM) as the non-LM mode for the current chroma block, wherein the DM is derived based on collocated reconstructed luma samples.
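
As a non-authoritative illustration of the mode selection recited in claims 4 and 5, the following minimal Python sketch computes an SATD cost for each candidate intra mode over a template and falls back to the second-lowest cost when the best mode coincides with the DM. The function names (`hadamard_1d`, `satd_4x4`, `select_cross_component_timd`), the 4x4 template shape, the unnormalized Hadamard transform, and the tie-break by mode index are all assumptions made for the example; the claims do not specify them.

```python
def hadamard_1d(v):
    """4-point Hadamard butterfly on a list of four integers."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd_4x4(pred, recon):
    """Unnormalized SATD between two 4x4 sample arrays (lists of lists).

    The 4x4 shape is an assumption for illustration only.
    """
    diff = [[p - r for p, r in zip(prow, rrow)] for prow, rrow in zip(pred, recon)]
    rows = [hadamard_1d(row) for row in diff]  # transform each row
    cols = [hadamard_1d([rows[y][x] for y in range(4)]) for x in range(4)]  # then each column
    return sum(abs(c) for col in cols for c in col)


def select_cross_component_timd(template_preds, template_recon, dm_mode):
    """Pick the candidate mode with minimum SATD over the template.

    template_preds maps a (hypothetical) mode index to its 4x4 template prediction.
    If the minimum-SATD mode equals dm_mode, the mode with the second minimum
    SATD is returned instead. Ties are broken by mode index (an assumption).
    """
    costs = {mode: satd_4x4(pred, template_recon) for mode, pred in template_preds.items()}
    ranked = sorted(costs, key=lambda mode: (costs[mode], mode))
    if ranked[0] == dm_mode and len(ranked) > 1:
        return ranked[1]
    return ranked[0]


if __name__ == "__main__":
    recon = [[10] * 4 for _ in range(4)]
    preds = {
        0: [[10] * 4 for _ in range(4)],   # matches the template exactly
        18: [[11] * 4 for _ in range(4)],  # off by one everywhere
        50: [[14] * 4 for _ in range(4)],  # off by four everywhere
    }
    print(select_cross_component_timd(preds, recon, dm_mode=0))
    print(select_cross_component_timd(preds, recon, dm_mode=50))
```

In this usage, mode 0 has the lowest cost, so it is skipped only when the DM is also mode 0; otherwise it is selected directly.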

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of PCT Patent Application No. PCT/US2023/012644, filed on Feb. 8, 2023, which claims priority to Provisional Application No. 63/309,491, filed on Feb. 11, 2022. The entire disclosures of the above applications are incorporated herein by reference for all purposes.

TECHNICAL FIELD

This disclosure is related to video coding and compression. More specifically, this disclosure relates to methods and apparatus for improving the coding efficiency of video blocks to which multi-hypothesis-based prediction technology is applied.

BACKGROUND

Digital video is supported by a variety of electronic devices, such as digital televisions, laptop or desktop computers, tablet computers, digital cameras, digital recording devices, digital media players, video gaming consoles, smart phones, video teleconferencing devices, video streaming devices, etc. The electronic devices transmit and receive or otherwise communicate digital video data across a communication network, and/or store the digital video data on a storage device. Due to a limited bandwidth capacity of the communication network and limited memory resources of the storage device, video coding may be used to compress the video data according to one or more video coding standards before it is communicated or stored. For example, video coding standards include Versatile Video Coding (VVC), Joint Exploration test Model (JEM), High-Efficiency Video Coding (HEVC/H.265), Advanced Video Coding (AVC/H.264), Moving Picture Experts Group (MPEG) coding, or the like.

Video coding generally utilizes prediction methods (e.g., inter-prediction, intra-prediction, or the like) that take advantage of redundancy inherent in the video data. Video coding aims to compress video data into a form that uses a lower bit rate, while avoiding or minimizing degradations to video quality.

SUMMARY

Examples of the present disclosure provide methods and apparatus for video coding with an intra prediction coding mode.

According to a first aspect of the present disclosure, a method for video decoding is provided. The method may include: obtaining, by a decoder, for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, wherein the LM mode comprises a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode; and combining, by the decoder, the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block.

According to a second aspect of the present disclosure, an apparatus is provided. The apparatus includes: one or more processors; and a memory configured to store instructions executable by the one or more processors; wherein the one or more processors, upon execution of the instructions, are configured to perform acts including: obtaining for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, wherein the LM mode comprises a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode; and combining the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block.

According to a third aspect of the present disclosure, a non-transitory computer-readable storage medium is provided. The non-transitory computer-readable storage medium stores thereon a bitstream to be decoded by acts including: obtaining for a current chroma block, a non-linear model (non-LM) mode and a linear model (LM) mode, wherein the LM mode comprises a cross component linear model (CCLM) mode and a multi-model linear model (MMLM) mode; and combining the non-LM mode and the LM mode for a multi-hypothesis-based chroma prediction (MCP) for the current chroma block.
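
The combining step can be pictured with a small sketch. Claim 1 recites that the final predictor is a weighted average of a first predicted chroma sample (from a TIMD mode) and a second predicted chroma sample (from the CCLM or MMLM mode). The Python below is one possible integer realization under stated assumptions: the function name `blend_mcp`, the weight values, and the shift-based rounding are hypothetical and are not taken from this section.

```python
def blend_mcp(non_lm_pred, lm_pred, w_non_lm=1, w_lm=1, shift=1):
    """Blend two chroma prediction blocks sample by sample.

    Computes (w_non_lm * p0 + w_lm * p1 + rounding) >> shift for every sample.
    The weights must sum to 2**shift so that the result stays in sample range.
    Weights and shift are illustrative assumptions, not values from the patent.
    """
    if w_non_lm + w_lm != 1 << shift:
        raise ValueError("weights must sum to 2**shift")
    rounding = (1 << shift) >> 1
    return [
        [(w_non_lm * a + w_lm * b + rounding) >> shift for a, b in zip(row0, row1)]
        for row0, row1 in zip(non_lm_pred, lm_pred)
    ]


if __name__ == "__main__":
    timd_block = [[100, 120], [90, 80]]  # hypothetical TIMD-based prediction
    lm_block = [[110, 121], [94, 60]]    # hypothetical CCLM/MMLM prediction
    print(blend_mcp(timd_block, lm_block))              # equal weights
    print(blend_mcp(timd_block, lm_block, 3, 1, 2))     # 3:1 in favor of TIMD
```

Keeping the weights as integers that sum to a power of two lets the division be a right shift, which is a common choice in codec arithmetic; a floating-point average would illustrate the same idea.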

The above general descriptions and detailed descriptions below are only exemplary and explanatory and not intended to limit the present disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate examples consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.

FIG. 1 is a block diagram illustrating an exemplary system for encoding and decoding video blocks in accordance with some implementations of the present disclosure.

FIG. 2 is a block diagram illustrating an exemplary video encoder in accordance with some implementations of the present disclosure.

FIG. 3 is a block diagram illustrating an exemplary video decoder in accordance with some implementations of the present disclosure.

FIGS. 4A through 4E are block diagrams illustrating how a frame is recursively partitioned into multiple video blocks of different sizes and shapes in accordance with some implementations of the present disclosure.

FIG. 5 illustrates an example of the locations of samples involved in the CCLM mode in accordance with some implementations of the present discl