US-12621485-B2 - Method, apparatus, and medium for video processing

US 12621485 B2

Abstract

Embodiments of the present disclosure provide a solution for video processing. A method for video processing is proposed. The method comprises: determining, during a conversion between a target video block of a video and a bitstream of the video, an affine prediction list of the target video block based on at least one of the following: a non-adjacent spatial candidate, or a temporal collocated candidate; and performing the conversion based on the affine prediction list.
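The abstract describes constructing an affine prediction list from non-adjacent spatial candidates and a temporal collocated candidate. A minimal, hypothetical Python sketch of such list construction (the class name, candidate ordering, pruning rule, and list cap are illustrative assumptions, not the disclosed design) might look like:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AffineCandidate:
    """Control-point motion vectors (CPMVs) of one affine candidate."""
    cpmv: tuple  # e.g. ((vx0, vy0), (vx1, vy1)) for a 4-parameter model

def build_affine_list(non_adjacent_spatial, temporal_collocated, max_size=5):
    """Hypothetical sketch: fill the list from non-adjacent spatial
    candidates (assumed pre-ordered by distance from the current block),
    prune duplicates, then append the temporal collocated candidate,
    capping the list at max_size entries."""
    candidates = []
    for cand in non_adjacent_spatial:
        if cand is not None and cand not in candidates:
            candidates.append(cand)
        if len(candidates) >= max_size:
            return candidates
    if temporal_collocated is not None and temporal_collocated not in candidates:
        candidates.append(temporal_collocated)
    return candidates[:max_size]
```

Unavailable neighbors are modeled as `None`, and duplicate pruning keeps the list diverse before the temporal candidate is considered.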

Inventors

  • Zhipin Deng
  • Kai Zhang
  • Li Zhang

Assignees

  • BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.
  • BYTEDANCE INC.

Dates

Publication Date
2026-05-05
Application Date
2024-06-07
Priority Date
2021-12-07

Claims (20)

  1. A method for video processing, comprising: applying, during a conversion between a target video block of a video and a bitstream of the video, a bi-directional optical flow (BDOF) process to the target video block, wherein the target video block is coded with a geometric partitioning mode (GPM) mode or a subblock-based temporal motion vector prediction (SbTMVP) mode; and performing the conversion based on the applying.
  2. The method of claim 1, wherein the BDOF process comprises a BDOF with motion vector refinement.
  3. The method of claim 1, further comprising: applying the BDOF process to a further video block of the video, wherein the further video block is coded with at least one of: an affine advanced motion vector prediction (AMVP) mode, an affine merge mode with motion vector difference (MMVD) mode, or an affine merge mode.
  4. The method of claim 1, wherein the target video block is coded with the GPM mode, and the GPM mode comprises at least one of the following: a regular GPM mode, a GPM-template matching (GPM-TM) mode, a GPM-motion vector difference (GPM-MMVD) mode, a GPM-Inter-Intra mode, or a variant of the GPM mode.
  5. The method of claim 1, further comprising: applying the BDOF process for a prediction sample value refinement.
  6. The method of claim 1, further comprising: applying the BDOF process for a subblock-based motion vector refinement.
  7. The method of claim 1, wherein a BDOF refined motion field is used as a motion candidate for a further block in a current picture.
  8. The method of claim 1, further comprising: applying the BDOF process for a video unit coded by an un-equal weighted prediction.
  9. The method of claim 8, wherein the un-equal weighted prediction comprises one of the following: a bi-prediction with coding unit (CU)-level weight (BCW) with unequal weights for two directional inter predictions, or a GPM with unequal weights for the two directional inter predictions from two GPM partitions.
  10. The method of claim 1, further comprising: cascading the BDOF process with a second prediction blending process in addition to a first prediction blending process.
  11. The method of claim 10, wherein the first prediction blending process comprises a regular bi-prediction averaging process.
  12. The method of claim 10, wherein the second prediction blending process comprises one of the following: a local illumination compensation (LIC) process, or a weighted prediction.
  13. The method of claim 1, further comprising: applying the BDOF process to a video unit adaptively.
  14. The method of claim 13, wherein applying the BDOF process to a video unit adaptively comprises: applying the BDOF process to the video unit under a pre-defined condition.
  15. The method of claim 14, wherein applying the BDOF process to the video unit under a pre-defined condition comprises: if the pre-defined condition is met for the video unit, applying the BDOF process to the video unit without including an indication of the BDOF process in the bitstream.
  16. The method of claim 1, wherein the conversion includes encoding the target video block into the bitstream.
  17. The method of claim 1, wherein the conversion includes decoding the target video block from the bitstream.
  18. An apparatus for processing video data comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to: apply, during a conversion between a target video block of a video and a bitstream of the video, a bi-directional optical flow (BDOF) process to the target video block, wherein the target video block is coded with a geometric partitioning mode (GPM) mode or a subblock-based temporal motion vector prediction (SbTMVP) mode; and perform the conversion based on the applying.
  19. A non-transitory computer-readable storage medium storing instructions that cause a processor to perform a method performed by a video processing apparatus, wherein the method comprises: applying a bi-directional optical flow (BDOF) process to a target video block of the video, wherein the target video block is coded with a geometric partitioning mode (GPM) mode or a subblock-based temporal motion vector prediction (SbTMVP) mode; and performing the conversion based on the applying.
  20. A method for storing a bitstream of a video, comprising: applying a bi-directional optical flow (BDOF) process to a target video block of the video, wherein the target video block is coded with a geometric partitioning mode (GPM) mode or a subblock-based temporal motion vector prediction (SbTMVP) mode; and generating the bitstream based on the applying; and storing the bitstream in a non-transitory computer-readable recording medium.
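The claims above apply a BDOF process to GPM- or SbTMVP-coded blocks. As a heavily simplified illustration of the general BDOF idea only (a floating-point, whole-block refinement; the function name `bdof_refine` and every detail are assumptions, not the claimed integer-arithmetic design), a BDOF-style bi-prediction refinement can be sketched as:

```python
import numpy as np

def bdof_refine(p0, p1):
    """Illustrative BDOF-style refinement for one block.

    p0, p1: 2-D arrays holding the two uni-directional predictions.
    Derives a single flow (vx, vy) from gradient correlations, then
    refines the average of the two predictions per sample. Real codecs
    use clipped integer arithmetic and per-4x4 windows.
    """
    p0 = p0.astype(float)
    p1 = p1.astype(float)
    # Horizontal and vertical sample gradients of each prediction.
    gx0, gy0 = np.gradient(p0, axis=1), np.gradient(p0, axis=0)
    gx1, gy1 = np.gradient(p1, axis=1), np.gradient(p1, axis=0)
    theta = p1 - p0                     # prediction difference
    psi_x, psi_y = gx0 + gx1, gy0 + gy1  # summed gradients
    # Gradient auto- and cross-correlations over the block.
    s1, s5 = (psi_x * psi_x).sum(), (psi_y * psi_y).sum()
    s2 = (psi_x * psi_y).sum()
    s3, s6 = (theta * psi_x).sum(), (theta * psi_y).sum()
    vx = s3 / s1 if s1 > 0 else 0.0
    vy = (s6 - vx * s2) / s5 if s5 > 0 else 0.0
    # Bi-prediction average plus a per-sample optical-flow correction.
    return 0.5 * (p0 + p1) + 0.5 * (vx * (gx0 - gx1) + vy * (gy0 - gy1))
```

When the two predictions agree, the correction term vanishes and the result reduces to the plain bi-prediction average.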

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of International Application No. PCT/CN2022/137207, filed on Dec. 7, 2022, which claims the benefit of International Application No. PCT/CN2021/135943, filed on Dec. 7, 2021. The entire contents of these applications are hereby incorporated by reference in their entireties.

FIELD

Embodiments of the present disclosure relate generally to video coding techniques, and more particularly, to affine prediction list construction.

BACKGROUND

Nowadays, digital video capabilities are being applied in various aspects of people's lives. Multiple types of video compression technologies, such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the ITU-T H.265 High Efficiency Video Coding (HEVC) standard, and the Versatile Video Coding (VVC) standard, have been proposed for video encoding/decoding. However, the coding efficiency of conventional video coding techniques is generally limited, which is undesirable.

SUMMARY

Embodiments of the present disclosure provide a solution for video processing. In a first aspect, a method for video processing is proposed. The method comprises: determining, during a conversion between a target video block of a video and a bitstream of the video, an affine prediction list of the target video block based on at least one of the following: a non-adjacent spatial candidate, or a temporal collocated candidate; and performing the conversion based on the affine prediction list. The method in accordance with the first aspect of the present disclosure determines the affine prediction list based on the non-adjacent spatial candidate or the temporal collocated candidate, and thus improves the coding efficiency and coding effectiveness.

In a second aspect, another method for video processing is proposed. 
The method comprises: determining, during a conversion between a target video block of a video and a bitstream of the video, at least one multi-hypothesis prediction (MHP) hypothesis of the target video block; applying a local illumination compensation (LIC) to the at least one MHP hypothesis; and performing the conversion based on the applying. The method in accordance with the second aspect of the present disclosure applies LIC to the MHP hypothesis, and thus improves the coding efficiency and coding effectiveness.

In a third aspect, another method for video processing is proposed. The method comprises: determining, during a conversion between a target video block of a video and a bitstream of the video, a motion candidate of the target video block; performing a plurality of rounds of search for a template matching-based bi-directional motion refinement on the motion candidate; and performing the conversion based on the performing. The method in accordance with the third aspect of the present disclosure performs a plurality of rounds of search for the template matching-based bi-directional motion refinement on the motion candidate, and thus improves the coding efficiency and coding effectiveness.

In a fourth aspect, another method for video processing is proposed. The method comprises: applying, during a conversion between a target video block of a video and a bitstream of the video, a bi-directional optical flow (BDOF) process to at least one of the following modes of the target video block: a symmetric motion vector difference (SMVD) mode, an affine mode, a combined inter and intra prediction (CIIP) mode, a geometric partitioning mode (GPM) mode, a subblock-based temporal motion vector prediction (SbTMVP) mode, or a multi-hypothesis prediction (MHP) mode; and performing the conversion based on the applying. 
The method in accordance with the fourth aspect of the present disclosure applies the BDOF process to a coding mode, and thus improves the coding efficiency and coding effectiveness.

In a fifth aspect, another method for video processing is proposed. The method comprises: applying, during a conversion between a target video block of a video and a bitstream of the video, a prediction refinement with optical flow (PROF) process to a plurality of luma predictions of the target video block; and performing the conversion based on the applying. The method in accordance with the fifth aspect of the present disclosure applies the PROF process to the luma predictions, and thus improves the coding efficiency and coding effectiveness.

In a sixth aspect, another method for video processing is proposed. The method comprises: applying, during a conversion between a target video block of a video and a bitstream of the video, a sample-based prediction refinement with optical flow (PROF) process for the target video block; and performing the conversion based on the applying. The method in accordance with the sixth aspect of the present disclosure applies the sample-based PROF process for the video block, and thus improves the coding efficiency and coding effectiveness.

In a seventh aspect, an apparatus for processing video dat