US-20260129216-A1 - METHOD, APPARATUS, AND MEDIUM FOR VIDEO PROCESSING


Abstract

Embodiments of the disclosure provide a solution for video processing. A method for video processing is proposed. The method includes: determining, for a conversion between a video unit of a video and a bitstream of the video, whether a target coding mode is applied to the video unit based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and performing the conversion based on the determining.
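As an illustration only, the refinement described in the abstract can be sketched as follows. The sketch assumes the LIC function is the usual linear model (refined = alpha * pred + beta) and that the modification is a slope offset, with the offset term recomputed so that the line still passes through a chosen pivot value. The names `refine_prediction`, `delta_alpha`, and `pivot`, the pivot rule, and the floating-point arithmetic are all assumptions made for this example and are not taken from the disclosure.

```python
def refine_prediction(pred, alpha, beta, delta_alpha=0.0, pivot=None, bit_depth=10):
    """Apply a linear LIC-style model with an adjusted slope to prediction samples.

    pred        : list of integer prediction samples
    alpha, beta : scale and offset of the unmodified linear model
    delta_alpha : hypothetical slope adjustment (assumption, not from the source)
    pivot       : hypothetical sample value at which the adjusted and unadjusted
                  lines agree; if None, only the slope changes
    """
    new_alpha = alpha + delta_alpha
    # Recompute the offset so both lines give the same output at the pivot.
    new_beta = beta if pivot is None else beta - delta_alpha * pivot
    max_val = (1 << bit_depth) - 1
    refined = []
    for p in pred:
        value = int(round(new_alpha * p + new_beta))
        # Clip to the valid sample range for the given bit depth.
        refined.append(min(max(value, 0), max_val))
    return refined


if __name__ == "__main__":
    # Unmodified model, then the same model with the slope raised by 0.5
    # around a pivot of 100.
    print(refine_prediction([100, 200], alpha=1.0, beta=10.0))
    print(refine_prediction([100, 200], alpha=1.0, beta=10.0, delta_alpha=0.5, pivot=100))
```

With a pivot, a sample equal to the pivot value is unchanged by the slope adjustment, while samples farther from the pivot are changed more. A practical codec would use fixed-point arithmetic rather than floats.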

Inventors

Yang Wang
Kai Zhang
Yuwen He
Hongbin Liu
Li Zhang

Assignees

Douyin Vision Co., Ltd.
BYTEDANCE INC.

Dates

Publication Date: 20260507
Application Date: 20251229
Priority Date: 20230629

Claims (20)

1 . A method for video processing, comprising: determining, for a conversion between a video unit of a video and a bitstream of the video, whether a target coding mode is applied to the video unit based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and performing the conversion based on the determining.
2 . The method of claim 1 , wherein whether to and/or how to apply the target coding mode depends on the coding information.
3 . The method of claim 1 , wherein the coding information comprises at least one of: whether a coding approach is allowed, block dimensions and/or block size, block depth, slice type, picture type, partition tree type, temporal layer identification, block location, colour format, or colour component.
4 . The method of claim 3 , wherein a block is disallowed to be coded with the target coding mode, if the block size is larger than or equal to a first threshold, wherein the block size is represented as W×H, W and H denote block width and block height, respectively, and/or wherein a block is disallowed to be coded with the target coding mode, if the block size is less than or equal to a second threshold, wherein the block size is represented as W×H, W and H denote block width and block height, respectively, and/or wherein a block is disallowed to be coded with the target coding mode, if a block width is larger than or equal to a third threshold and/or a block height is larger than or equal to a fourth threshold, and/or wherein a block is disallowed to be coded with the target coding mode, if a block width is less than or equal to a fifth threshold and/or a block height is less than or equal to a sixth threshold, and/or wherein a block is disallowed to be coded with the target coding mode, if a ratio of block width to block height or a ratio of block height to block width is larger than or equal to a seventh threshold, and/or wherein the block size is a luma block size, or the block size is a chroma block size, and/or wherein a block is disallowed to be coded with the target coding mode, if the temporal layer identification is less than or equal to TidTh1, and/or wherein a block is disallowed to be coded with the target coding mode, if the temporal layer identification is larger than or equal to TidTh2, and/or wherein if a block is located at a left-top boundary of a picture or slice or tile, use of the target coding mode is disallowed for the block.
5 . The method of claim 4 , wherein the first threshold is equal to 64, or 128, or 256, or 512, or 1024, or 2048, or 4096, or 8192, and/or wherein the second threshold is equal to 16, or 32, or 64, or 128, or 256, and/or wherein the third threshold is equal to 32 or 64 or 128 or 256, and/or wherein the fourth threshold is equal to 32 or 64 or 128 or 256, and/or wherein the fifth threshold is equal to 4 or 8 or 16 or 32 or 64, or wherein the sixth threshold is equal to 4 or 8 or 16 or 32 or 64, and/or wherein the seventh threshold is 2 or 4 or 8 or 16 or 32, and/or wherein TidTh1 is equal to 0 or 1 or 2, and/or wherein TidTh2 is equal to 3 or 4 or 5 or 6.
6 . The method of claim 5 , wherein at least one of: the first, second, third, fourth, fifth, sixth or seventh threshold depends on slice type or picture type.
7 . The method of claim 1 , wherein the indication of the target coding mode is signaled based on a condition, wherein the condition comprises at least one of: whether LIC is allowed or used, whether AMVP mode is used, whether uni-prediction/bi-prediction is used, whether AMVP-merge mode is allowed or used, whether LIC with multiple templates is allowed or used, whether LIC with multiple models is allowed or used, block dimensions and/or block size, block depth, slice type, picture type, partition tree type, temporal layer identification, block location, colour format, or colour component.
8 . The method of claim 7 , wherein an indication of the target coding mode is not signaled, if the block size is larger than or equal to an eighth threshold, wherein the block size is represented as W×H, W and H denote block width and block height, respectively, and/or wherein an indication of the target coding mode is not signaled, if the block size is less than or equal to a ninth threshold, wherein the block size is represented as W×H, W and H denote block width and block height, respectively, and/or wherein an indication of the target coding mode is not signaled, if a block width is larger than or equal to a tenth threshold and/or a block height is larger than or equal to an eleventh threshold, and/or wherein an indication of the target coding mode is not signaled, if a block width is less than or equal to a twelfth threshold and/or a block height is less than or equal to a thirteenth threshold, and/or wherein an indication of the target coding mode is not signaled, if a ratio of block width to block height or a ratio of block height to block width is larger than or equal to a fourteenth threshold, and/or wherein the block size is a luma block size, or the block size is a chroma block size, and/or wherein an indication of the target coding mode is not signaled, if the temporal layer identification is less than or equal to TidTh3, and/or wherein an indication of the target coding mode is not signaled, if the temporal layer identification is larger than or equal to TidTh4.
9 . The method of claim 8 , wherein the eighth threshold is equal to 64, or 128, or 256, or 512, or 1024, or 2048, or 4096, and/or wherein the ninth threshold is equal to 16, or 32, or 64, or 128, or 256, and/or wherein the tenth threshold is equal to 32 or 64 or 128 or 256, and/or wherein the eleventh threshold is equal to 32 or 64 or 128 or 256, and/or wherein the twelfth threshold is equal to 4 or 8 or 16 or 32 or 64, and/or wherein the thirteenth threshold is equal to 4 or 8 or 16 or 32 or 64, and/or wherein the fourteenth threshold is 2 or 4 or 8 or 16 or 32, and/or wherein TidTh3 is equal to 0 or 1 or 2, and/or wherein TidTh4 is equal to 3 or 4 or 5 or 6.
10 . The method of claim 9 , wherein at least one of: the eighth, ninth, tenth, eleventh, twelfth, thirteenth or fourteenth threshold depends on slice type or picture type.
11 . The method of claim 1 , wherein whether a current block is coded with the target coding mode is signaled using one or more syntax elements (SEs).
12 . The method of claim 11 , wherein if a first SE indicates that the target coding mode is applicable, a second SE is signaled to indicate whether the target coding mode is used.
13 . The method of claim 12 , wherein the first SE is at one of: a sequence header, a picture header, a sequence parameter set (SPS), a video parameter set (VPS), a decoding parameter set (DPS), a decoding capability information (DCI), a picture parameter set (PPS), an adaptation parameter set (APS), a slice header, or a tile group header.
14 . The method of claim 12 , wherein the first SE comprises an SPS flag, a PPS flag, or a slice flag, and/or wherein the first SE comprises a general constraint information syntax for target coding and/or LIC.
15 . The method of claim 11 , wherein if a second SE indicates that the target coding mode is used, a third SE is signaled to indicate how to perform the LIC-slope, and/or wherein the one or more syntax elements are binarized with fixed length coding, or truncated unary coding, or unary coding, or EG coding, or coded as a flag, and/or wherein the one or more syntax elements are bypass coded or context coded.
16 . The method of claim 1 , wherein the conversion includes encoding the video unit into the bitstream.
17 . The method of claim 1 , wherein the conversion includes decoding the video unit from the bitstream.
18 . An apparatus for video processing comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform acts comprising: determining, for a conversion between a video unit of a video and a bitstream of the video, whether a target coding mode is applied to the video unit based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and performing the conversion based on the determining.
19 . A non-transitory computer-readable storage medium storing instructions that cause a processor to perform acts comprising: determining, for a conversion between a video unit of a video and a bitstream of the video, whether a target coding mode is applied to the video unit based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and performing the conversion based on the determining.
20 . A non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by an apparatus for video processing, wherein the method comprises: determining whether a target coding mode is applied to a video unit of the video based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and generating the bitstream of the video unit based on the determining.
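For illustration, the block-level restrictions of claims 4 and 5 and the two-level signaling of claims 11 and 12 can be sketched as below. This is a minimal sketch under stated assumptions: the threshold constants are single values picked from the alternatives listed in claim 5, only a subset of the claimed conditions is checked, and the rule that an unsignaled indication is inferred as "not used" is an assumption rather than something the claims state. The names `Block`, `mode_allowed`, `parse_mode_flag`, and `read_flag` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    width: int
    height: int
    temporal_id: int


# Hypothetical choices taken from the value lists in claim 5.
FIRST_THRESHOLD = 4096   # disallow if W*H >= this
SECOND_THRESHOLD = 16    # disallow if W*H <= this
SEVENTH_THRESHOLD = 8    # disallow if W/H or H/W >= this
TID_TH2 = 5              # disallow if temporal layer id >= this


def mode_allowed(block):
    """Return True if the target coding mode may be used for this block."""
    area = block.width * block.height
    if area >= FIRST_THRESHOLD or area <= SECOND_THRESHOLD:
        return False
    longer = max(block.width, block.height)
    shorter = min(block.width, block.height)
    # Integer form of "longer / shorter >= SEVENTH_THRESHOLD".
    if longer >= SEVENTH_THRESHOLD * shorter:
        return False
    if block.temporal_id >= TID_TH2:
        return False
    return True


def parse_mode_flag(first_se_enabled, block, read_flag):
    """Read the block-level flag (second SE) only when it would be signaled.

    first_se_enabled : value of a higher-level enabling flag (first SE)
    read_flag        : callable that returns the next flag from the bitstream
    """
    if not first_se_enabled or not mode_allowed(block):
        # Assumption: an indication that is not signaled is inferred as "not used".
        return False
    return bool(read_flag())


if __name__ == "__main__":
    print(mode_allowed(Block(16, 16, 0)))
    print(mode_allowed(Block(64, 64, 0)))
    print(parse_mode_flag(True, Block(16, 16, 0), lambda: 1))
```

Because an encoder and a decoder would evaluate the same conditions, no bits are spent on the flag for blocks where the mode is disallowed.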

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of International Application No. PCT/CN2024/102046, filed on Jun. 27, 2024, which claims the benefit of International Application No. PCT/CN2023/103830, filed on Jun. 29, 2023, International Application No. PCT/CN2023/124102, filed on Oct. 11, 2023, and International Application No. PCT/CN2023/138564, filed on Dec. 13, 2023. The entire contents of these applications are hereby incorporated by reference in their entireties.

FIELDS
Embodiments of the present disclosure relate generally to video processing techniques, and more particularly, to local illumination compensation with slope adjustment.

BACKGROUND
Nowadays, digital video capabilities are being applied in various aspects of people's lives. Multiple types of video compression technologies, such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the ITU-T H.265 High Efficiency Video Coding (HEVC) standard, and the Versatile Video Coding (VVC) standard, have been proposed for video encoding/decoding. However, coding efficiency of video coding techniques is generally expected to be further improved.

SUMMARY
Embodiments of the present disclosure provide a solution for video processing.
In a first aspect, a method for video processing is proposed. The method comprises: determining, for a conversion between a video unit of a video and a bitstream of the video, whether a target coding mode is applied to the video unit based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and performing the conversion based on the determining. In this way, coding performance can be improved by adjusting the parameters.
In a second aspect, an apparatus for video processing is proposed. The apparatus comprises a processor and a non-transitory memory with instructions thereon. The instructions, upon execution by the processor, cause the processor to perform a method in accordance with the first aspect of the present disclosure.
In a third aspect, a non-transitory computer-readable storage medium is proposed. The non-transitory computer-readable storage medium stores instructions that cause a processor to perform a method in accordance with the first aspect of the present disclosure.
In a fourth aspect, another non-transitory computer-readable recording medium is proposed. The non-transitory computer-readable recording medium stores a bitstream of a video which is generated by a method performed by an apparatus for video processing. The method comprises: determining whether a target coding mode is applied to a video unit of the video based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; and generating the bitstream of the video unit based on the determining.
In a fifth aspect, a method for storing a bitstream of a video is proposed. The method comprises: determining whether a target coding mode is applied to a video unit of the video based on coding information or an indication, and wherein in the target coding mode, a refined prediction sample of a prediction sample in the video unit is derived by applying a function used in local illumination compensation (LIC) to the prediction sample, one or more parameters of the function are modified; generating the bitstream of the video unit based on the determining; and storing the bitstream in a non-transitory computer-readable recording medium.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS
Through the following detailed description with reference to the accompanying drawings, the above and other objectives, features, and advantages of example embodiments of the present disclosure will become more apparent. In the example embodiments of the present disclosure, the same reference numerals usually refer to the same components.
FIG. 1 is a block diagram that illustrates an example video coding system, in accordance with some embodiments of the present disclosure;
FIG. 2 is a block diagram that illustrates a first example video encoder, in accordance with some embodiments of the present disclosure;
FIG. 3 is a block diagram that illustrates an example video decoder, in accordance with some emb