US-20260129218-A1 - METHOD, APPARATUS, AND MEDIUM FOR VIDEO PROCESSING
Abstract
Embodiments of the disclosure provide a solution for video processing. A method for video processing is proposed. The method includes: determining, for a conversion between a video unit of a video and a bitstream of the video, that one or more parameters of a cross-component residual model (CCRM) of the video unit are inherited from a previous CCRM coded block; and performing the conversion based on the CCRM.
Inventors
- Zhipin Deng
- Kai Zhang
- Li Zhang
Assignees
- Douyin Vision Co., Ltd.
- BYTEDANCE INC.
Dates
- Publication Date
- 20260507
- Application Date
- 20251230
- Priority Date
- 20230630
Claims (20)
- 1 . A method for video processing, comprising: determining, for a conversion between a video unit of a video and a bitstream of the video, that one or more parameters of a cross-component residual model (CCRM) of the video unit is inherited from a previous CCRM coded block; and performing the conversion based on the CCRM.
- 2 . The method of claim 1 , wherein at least one syntax element is signalled at a video unit level to specify whether and/or how to use CCRM model inheritance mode.
- 3 . The method of claim 2 , wherein the at least one syntax element is signaled at one of: block level, transform unit (TU) level, prediction unit (PU) level, or coding unit (CU) level, and/or wherein an indicator to specify whether the video unit uses regular CCRM mode or CCRM model inheritance mode is signalled at video unit level, and/or wherein the indicator to specify whether the video unit uses regular CCRM mode or CCRM model inheritance mode is signalled based on a condition associated with at least one type of CCRM used for the video unit, and/or wherein a first syntax is signalled to indicate that the video unit uses a type of CCRM mode, and a second syntax is further signalled to indicate that which type of CCRM mode is used, and/or wherein if CCRM model inheritance mode is used, another syntax is further signalled to specify which CCRM model candidate is selected to be inherited, and/or wherein an indicator is signalled at video unit level to specify at least one of: whether the video unit uses CCRM model inheritance mode or which candidate is used for CCRM model inheritance mode.
- 4 . The method of claim 3 , wherein the second syntax is signalled, if there is at least one available CCRM candidate.
- 5 . The method of claim 1 , wherein if the CCRM model inheritance mode is used, a CCRM model candidate list is generated.
- 6 . The method of claim 5 , wherein a maximum length of list size is pre-defined in the bitstream, and/or wherein a CCRM model candidate is obtained based on previously coded CCRM blocks, and/or wherein at least one of pruning, redundancy, or similarity check is applied to CCRM candidate list construction, and/or wherein a CCRM candidate reordering is applied to the CCRM model candidate list.
- 7 . The method of claim 6 , wherein the list size is equal to 6 or 10 or 12, and/or wherein a size of history table is pre-defined, and/or wherein the previously coded CCRM blocks are at least one of: spatial adjacent neighbors, temporal candidate, spatial non-adjacent neighbors, history based CCRM candidate, shifted candidates, or default CCRM candidate, and/or wherein a candidate inserting order follows a pre-defined rule, and/or wherein a CCRM model candidate is checked at subblock granularity, and/or wherein a pre-defined scatter check order is used for checking CCRM model candidates, and/or wherein positions of non-adjacent neighbor blocks are based on block dimensions of the video unit, and/or wherein a motion shift is used to locate temporal candidates, and/or wherein one or more history based CCRM candidates are from a first-in-first-out history table, and/or wherein if a to-be-inserted candidate is different from specified candidates already in a CCRM candidate list, the to-be-inserted candidate is inserted to the CCRM candidate list, and/or wherein at least one of: same pruning, same redundancy or same similarity check rule is applied to all types of CCRM candidates, and/or wherein at least one of: different pruning, different redundancy or different similarity check rules is applied to different types of CCRM candidates, and/or wherein CCRM candidates in the CCRM model candidate list are sorted based on a decoder derived cost, and/or wherein a cost is derived based on applying the CCRM candidate to a reference region or reference block of the video unit, and/or wherein a cost is derived based on applying a CCRM candidate to a neighboring region or neighboring block of the video unit, and/or wherein based on decoder derived costs, CCRM model candidates are sorted from lowest cost to highest cost, and a CCRM model candidate with minimum cost is ordered at the first of the CCRM model candidate list.
- 8 . The method of claim 7 , wherein the size of the history table is equal to 5 or 6, and/or wherein the candidate inserting order is a sequence as follows: spatial adjacent, temporal, spatial non-adjacent, history, shifted, default, and/or wherein each continuous subblock within a pre-defined region is checked, and/or wherein the positions are with a distance from the video unit, wherein the distance is proportional to a width and/or height of the video unit, and/or wherein the motion shift is based on a motion vector of a neighbor block, and/or wherein the temporal candidates are from a collocated picture, and/or wherein the temporal candidates are from a reference picture which is not a collocated picture, and/or wherein the first-in-first-out history table is initialized at one of: tile level, coding tree unit (CTU) level, row level, slice level, or picture level, and/or wherein the specified candidates refer to all available CCRM candidates in the CCRM candidate list, and/or wherein the specified candidates refer to one or more CCRM candidates in the CCRM candidate list, and/or wherein the reference region or reference block is identified by a motion vector of a current block, and/or wherein for each CCRM candidate, a CCRM model is firstly applied to a reference luma to get a predicted reference chroma, and then the cost is computed as an absolute difference between a true reference chroma and the predicted reference chroma, and/or wherein the neighboring region or neighboring block is above neighbors of the video unit or left neighbors of the video unit, and/or wherein for each CCRM candidate, a CCRM model is firstly applied to a neighboring luma to get a predicted neighboring chroma, and then the cost is computed as an absolute difference between a true neighboring chroma and the predicted neighboring chroma.
- 9 . The method of claim 8 , wherein all 4×4 subblocks above and left to the video unit are checked, and/or wherein the one or more CCRM candidates comprise at least one of: a last CCRM candidate in the CCRM candidate list or last X CCRM candidates in the CCRM candidate list, wherein X is a pre-defined constant.
- 10 . The method of claim 1 , wherein if CCRM model inheritance mode is used, a target CCRM candidate model is directly applied to the video unit without model estimation, and/or wherein a CCRM model of a CCRM coded block is stored in a buffer.
- 11 . The method of claim 10 , wherein a first candidate of the CCRM model candidate list is used for the CCRM model inheritance mode, and/or wherein which candidate of the CCRM model candidate list used to a block which is CCRM model inheritance mode coded is signalled in the bitstream, and/or wherein for a reconstructed reordered intra block copy (RRIBC) block coded with the CCRM mode inheritance mode, an inherited CCRM model is applied based on an inherited RRIBC flip type.
- 12 . The method of claim 11 , wherein if the inherited RRIBC flip type indicates that the inherited CCRM model is from a RRIBC coded block, one or more CCRM filter taps for the video unit are flipped according to the inherited RRIBC flip type.
- 13 . The method of claim 12 , wherein the inherited RRIBC flip type is non-zero, and/or wherein during generating the CCRM filter taps from non-downsampled luma samples, filter taps of non-downsampled luma samples are swapped or flipped.
- 14 . The method of claim 11 , wherein if an 8-tap CCRM model comprises 6 spatial luma samples, a nonlinear term, and a bias term, spatial luma samples are obtained from a luma grid selecting 6 luma samples closest to a chroma position without down sampling, and a predicted chroma value is obtained as predChromaVal = c0·L0 + c1·L1 + c2·L2 + c3·L3 + c4·L4 + c5·L5 + c6·nonlinear((L0 + L3 + 1) >> 1) + c7·B, wherein L0, L1, L2, L3, L4 and L5 represent spatial luma samples, respectively, c0, c1, c2, c3, c4, c5, c6 and c7 represent coefficients, respectively, nonlinear represents CCCM's nonlinear operator and B represents a bias.
- 15 . The method of claim 14 , wherein if the inherited CCRM model is from a horizontal flipped RRIBC coded block, during applying the inherited CCRM model to the video unit, luma samples L1 and L2 are swapped and luma samples L4 and L5 are swapped, and/or wherein if the inherited CCRM model is from a vertical flipped RRIBC coded block, during applying the inherited CCRM model to the video unit, luma samples L1 and L4 are swapped, luma samples L0 and L3 are swapped, and luma samples L2 and L5 are swapped.
- 16 . The method of claim 10 , wherein stored CCRM model information includes at least one of the following information: CCRM model coefficients or taps or parameters of Cb and Cr components, respectively, middle value of the CCRM coded block, bit depth of the CCRM coded block, an offset value of Y component of the CCRM coded block, an offset values of U and/or V components of the CCRM coded block, or a RRIBC flip type of the CCRM coded block.
- 17 . The method of claim 1 , wherein the conversion includes encoding the video unit into the bitstream, or wherein the conversion includes decoding the video unit from the bitstream.
- 18 . An apparatus for video processing comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform acts comprising: determining, for a conversion between a video unit of a video and a bitstream of the video, that one or more parameters of a cross-component residual model (CCRM) of the video unit is inherited from a previous CCRM coded block; and performing the conversion based on the CCRM.
- 19 . A non-transitory computer-readable storage medium storing instructions that cause a processor to perform acts comprising: determining, for a conversion between a video unit of a video and a bitstream of the video, that one or more parameters of a cross-component residual model (CCRM) of the video unit is inherited from a previous CCRM coded block; and performing the conversion based on the CCRM.
- 20 . A non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by an apparatus for video processing, wherein the method comprises: determining that one or more parameters of a cross-component residual model (CCRM) of a video unit of the video is inherited from a previous CCRM coded block; and generating the bitstream of the video unit based on the CCRM.
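The model application of claims 14 and 15 and the cost-based reordering of claims 7 and 8 can be illustrated with a minimal sketch. It is not the claimed implementation. The following are assumptions made only for illustration: the form of the nonlinear operator (a scaled square, as commonly described for CCCM), the 10-bit depth, the use of the mid value as the bias B, the integer coding of the flip type, floating-point coefficients, and all function names.

```python
from typing import List, Sequence, Tuple

BIT_DEPTH = 10                     # assumed sample bit depth
MID_VAL = 1 << (BIT_DEPTH - 1)     # assumed mid value, used here as the bias B


def nonlinear(v: int) -> int:
    # Assumed form of the CCCM nonlinear operator: a scaled square.
    return (v * v + MID_VAL) >> BIT_DEPTH


def flip_luma(luma: Sequence[int], flip_type: int) -> List[int]:
    # Reorder L0..L5 when the inherited model comes from a flipped RRIBC block.
    # flip_type coding is hypothetical: 0 = none, 1 = horizontal, 2 = vertical.
    l = list(luma)
    if flip_type == 1:             # horizontal: swap L1/L2 and L4/L5
        l[1], l[2] = l[2], l[1]
        l[4], l[5] = l[5], l[4]
    elif flip_type == 2:           # vertical: swap L1/L4, L0/L3 and L2/L5
        l[1], l[4] = l[4], l[1]
        l[0], l[3] = l[3], l[0]
        l[2], l[5] = l[5], l[2]
    return l


def predict_chroma(coeffs: Sequence[float], luma: Sequence[int],
                   flip_type: int = 0) -> float:
    # 8-tap model: six spatial taps, one nonlinear tap, one bias tap.
    l = flip_luma(luma, flip_type)
    spatial = sum(c * s for c, s in zip(coeffs[:6], l))
    return (spatial
            + coeffs[6] * nonlinear((l[0] + l[3] + 1) >> 1)
            + coeffs[7] * MID_VAL)


def reorder_candidates(candidates: Sequence[Sequence[float]],
                       samples: Sequence[Tuple[Sequence[int], int]]
                       ) -> List[Sequence[float]]:
    # samples holds (six luma samples, true chroma) pairs taken from a
    # neighboring or reference region; the cost is the sum of absolute
    # differences between true and predicted chroma.
    def cost(coeffs: Sequence[float]) -> float:
        return sum(abs(chroma - predict_chroma(coeffs, luma))
                   for luma, chroma in samples)
    # sorted() is stable, so equal-cost candidates keep their insertion order.
    return sorted(candidates, key=cost)
```

Under these assumptions, the first entry returned by reorder_candidates is the minimum-cost candidate, which could then be applied to the video unit directly without model estimation.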
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of International Application No. PCT/CN2024/101748, filed on Jun. 26, 2024, which claims the benefit of International Application No. PCT/CN2023/105010, filed on Jun. 30, 2023. The entire contents of these applications are hereby incorporated by reference in their entireties.
FIELDS
Embodiments of the present disclosure relate generally to video processing techniques, and more particularly, to a cross-component model for residual coding.
BACKGROUND
Nowadays, digital video capabilities are being applied in various aspects of people's lives. Multiple types of video compression technologies, such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the ITU-T H.265 High Efficiency Video Coding (HEVC) standard, and the Versatile Video Coding (VVC) standard, have been proposed for video encoding/decoding. However, conventional video coding has several undesirable issues. Therefore, the coding gain of conventional video coding techniques is generally expected to be further improved.
SUMMARY
Embodiments of the present disclosure provide a solution for video processing.
In a first aspect, a method for video processing is proposed. The method comprises: determining, for a conversion between a video unit of a video and a bitstream of the video, that one or more parameters of a cross-component residual model (CCRM) of the video unit are inherited from a previous CCRM coded block; and performing the conversion based on the CCRM. In this way, coding efficiency and coding performance can be improved.
In a second aspect, an apparatus for video processing is proposed. The apparatus comprises a processor and a non-transitory memory with instructions thereon. The instructions, upon execution by the processor, cause the processor to perform a method in accordance with the first aspect of the present disclosure.
In a third aspect, a non-transitory computer-readable storage medium is proposed. The non-transitory computer-readable storage medium stores instructions that cause a processor to perform a method in accordance with the first aspect of the present disclosure.
In a fourth aspect, another non-transitory computer-readable recording medium is proposed. The non-transitory computer-readable recording medium stores a bitstream of a video which is generated by a method performed by an apparatus for video processing. The method comprises: determining that one or more parameters of a cross-component residual model (CCRM) of a video unit of the video are inherited from a previous CCRM coded block; and generating the bitstream of the video unit based on the CCRM.
In a fifth aspect, a method for storing a bitstream of a video is proposed. The method comprises: determining that one or more parameters of a cross-component residual model (CCRM) of a video unit of the video are inherited from a previous CCRM coded block; generating the bitstream of the video unit based on the CCRM; and storing the bitstream in a non-transitory computer-readable recording medium.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
Through the following detailed description with reference to the accompanying drawings, the above and other objectives, features, and advantages of example embodiments of the present disclosure will become more apparent. In the example embodiments of the present disclosure, the same reference numerals usually refer to the same components.
FIG. 1 is a block diagram illustrating an example video coding system, in accordance with some embodiments of the present disclosure;
FIG. 2 is a block diagram illustrating a first example video encoder, in accordance with some embodiments of the present disclosure;
FIG. 3 is a block diagram illustrating an example video decoder, in accordance with some embodiments of the present disclosure;
FIG. 4 illustrates the effect of the slope adjustment parameter “u”, where the model created with the current CCLM is shown on the left and the model updated as proposed is shown on the right;
FIG. 5 illustrates neighboring blocks (L, A, BL, AR, AL) used in the derivation of a general MPM list;
FIG. 6 illustrates neighboring reconstructed samples used for DIMD chroma mode;
FIG. 7 illustrates the intra template matching search area used;
FIG. 8 illustrates the use of an IntraTMP block vector for an IBC block;
FIG. 9 illustrates the division method for angular modes;
FIG. 10 illustrates an extended MRL candidate list;
FIG. 11 illustrates the template area;
FIG. 12 illustrates the spatial part of the convolutional filter;
FIG. 13 illustrates