US-12621461-B2 - Coding affine motion models for video coding
Abstract
A video decoder may receive a block of video data to be decoded using a 6-parameter affine advanced motion vector predictor (AMVP) mode. The video decoder may decode a first syntax element indicating a first motion vector difference (MVD) for a first control point motion vector (CPMV) for the block, and may also decode a flag that indicates whether a second MVD for a second CPMV for the block and a third MVD for a third CPMV for the block are both equal to the first MVD. The video decoder may then determine the second MVD and the third MVD based on the flag, and decode the block of video data using the first MVD, the second MVD, and the third MVD to generate a decoded block.
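The decoder-side flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `decode_affine_mvds` and the `symbols` iterator are hypothetical, and the iterator stands in for an entropy decoder by yielding already-parsed values in an assumed bitstream order (first MVD, flag, then the second and third MVDs only when the flag is not set).

```python
from typing import Iterator, List, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) MVD components


def decode_affine_mvds(symbols: Iterator) -> List[MV]:
    """Return the three control-point MVDs of a 6-parameter affine AMVP block.

    `symbols` is a hypothetical stand-in for an entropy decoder; the symbol
    order is an assumption made for this sketch.
    """
    mvd0 = next(symbols)               # first syntax element: MVD for the first CPMV
    all_equal = bool(next(symbols))    # flag: are the second and third MVDs both equal to the first?
    if all_equal:
        # Nothing more is parsed; both remaining MVDs are copies of the first.
        mvd1 = mvd0
        mvd2 = mvd0
    else:
        # The second and third MVDs are signaled explicitly.
        mvd1 = next(symbols)
        mvd2 = next(symbols)
    return [mvd0, mvd1, mvd2]


if __name__ == "__main__":
    print(decode_affine_mvds(iter([(3, -1), 1])))
    print(decode_affine_mvds(iter([(3, -1), 0, (2, 0), (0, 5)])))
```

When the flag is set, only one MVD and one flag are consumed for the block, which is where the potential bit saving comes from.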
Inventors
- Han Huang
- Vadim SEREGIN
- Zhi Zhang
- Marta Karczewicz
Assignees
- QUALCOMM INCORPORATED
Dates
- Publication Date: 20260505
- Application Date: 20240320
Claims (20)
- 1. A method of decoding video data, the method comprising: receiving a block of video data to be decoded using a 6-parameter affine advanced motion vector predictor (AMVP) mode; decoding a first syntax element indicating a first motion vector difference (MVD) for a first control point motion vector (CPMV) for the block; decoding a flag that indicates whether a second MVD for a second CPMV for the block and a third MVD for a third CPMV for the block are both to be equal to the first MVD; determining the second MVD and the third MVD based on the flag; and decoding the block of video data using the first MVD, the second MVD, and the third MVD to generate a decoded block.
- 2. The method of claim 1, wherein the flag indicates the second MVD and the third MVD are both to be equal to the first MVD, and wherein determining the second MVD and the third MVD based on the flag comprises: setting the second MVD to be equal to the first MVD; and setting the third MVD to be equal to the first MVD.
- 3. The method of claim 1, wherein the flag indicates the second MVD and the third MVD are not both to be equal to the first MVD, and wherein determining the second MVD and the third MVD based on the flag comprises: decoding a second syntax element indicating the second MVD for the second CPMV of the block; and decoding a third syntax element indicating the third MVD for the third CPMV of the block.
- 4. The method of claim 1, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, and wherein decoding the flag comprises: decoding a single flag that indicates if the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both to be equal to the first MVD, wherein the single flag is applicable to both prediction directions for the 6-parameter bi-directional AMVP mode.
- 5. The method of claim 1, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, and wherein decoding the flag comprises: decoding a first flag for a first prediction direction that indicates whether the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both to be equal to the first MVD; and decoding a second flag for a second prediction direction that indicates whether the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both to be equal to the first MVD.
- 6. The method of claim 1, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, wherein the flag indicates the second MVD and the third MVD are both to be equal to the first MVD, and wherein the method further comprises: determining the first MVD to be for a first prediction direction; setting the second MVD to be equal to the first MVD for the first prediction direction; setting the third MVD to be equal to the first MVD for the first prediction direction; and setting all MVDs to zero for a second prediction direction.
- 7. The method of claim 1, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, wherein the flag indicates the second MVD and the third MVD are both to be equal to the first MVD, and wherein the method further comprises: determining the first MVD to be for a first prediction direction; setting the second MVD to be equal to the first MVD for the first prediction direction; setting the third MVD to be equal to the first MVD for the first prediction direction; deriving a first MVD for a second prediction direction from the first MVD for the first prediction direction; setting a second MVD for the second prediction direction to be equal to the first MVD for the second prediction direction; and setting a third MVD for the second prediction direction to be equal to the first MVD for the second prediction direction.
- 8. An apparatus configured to decode video data, the apparatus comprising: a memory; and one or more processors implemented in circuitry and in communication with the memory, the one or more processors configured to: receive a block of video data to be decoded using a 6-parameter affine advanced motion vector predictor (AMVP) mode; decode a first syntax element indicating a first motion vector difference (MVD) for a first control point motion vector (CPMV) for the block; decode a flag that indicates whether a second MVD for a second CPMV for the block and a third MVD for a third CPMV for the block are both to be equal to the first MVD; determine the second MVD and the third MVD based on the flag; and decode the block of video data using the first MVD, the second MVD, and the third MVD to generate a decoded block.
- 9. The apparatus of claim 8, wherein the flag indicates the second MVD and the third MVD are both to be equal to the first MVD, and wherein to determine the second MVD and the third MVD based on the flag, the one or more processors are further configured to: set the second MVD to be equal to the first MVD; and set the third MVD to be equal to the first MVD.
- 10. The apparatus of claim 8, wherein the flag indicates the second MVD and the third MVD are not to be equal to the first MVD, and wherein to determine the second MVD and the third MVD based on the flag, the one or more processors are further configured to: decode a second syntax element indicating the second MVD for the second CPMV of the block; and decode a third syntax element indicating the third MVD for the third CPMV of the block.
- 11. The apparatus of claim 8, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, and wherein to decode the flag, the one or more processors are further configured to: decode a single flag that indicates if the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both to be equal to the first MVD, wherein the single flag is applicable to both prediction directions for the 6-parameter bi-directional AMVP mode.
- 12. The apparatus of claim 8, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, and wherein to decode the flag, the one or more processors are further configured to: decode a first flag for a first prediction direction that indicates whether the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both to be equal to the first MVD; and decode a second flag for a second prediction direction that indicates whether the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both to be equal to the first MVD.
- 13. The apparatus of claim 8, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, wherein the flag indicates the second MVD and the third MVD are both to be equal to the first MVD, and wherein the one or more processors are further configured to: determine the first MVD to be for a first prediction direction; set the second MVD to be equal to the first MVD for the first prediction direction; set the third MVD to be equal to the first MVD for the first prediction direction; and set all MVDs to zero for a second prediction direction.
- 14. The apparatus of claim 8, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, wherein the flag indicates the second MVD and the third MVD are both to be equal to the first MVD, and wherein the one or more processors are further configured to: determine the first MVD to be for a first prediction direction; set the second MVD to be equal to the first MVD for the first prediction direction; set the third MVD to be equal to the first MVD for the first prediction direction; derive a first MVD for a second prediction direction from the first MVD for the first prediction direction; set a second MVD for the second prediction direction to be equal to the first MVD for the second prediction direction; and set a third MVD for the second prediction direction to be equal to the first MVD for the second prediction direction.
- 15. The apparatus of claim 8, further comprising: a display configured to display a picture that includes the decoded block.
- 16. A method of encoding video data, the method comprising: receiving a block of video data to be encoded using a 6-parameter affine advanced motion vector predictor (AMVP) mode; determining a first motion vector difference (MVD) for a first control point motion vector (CPMV) for the block; determining a second MVD for a second CPMV for the block; determining a third MVD for a third CPMV for the block; encoding the block of video data using the first MVD, the second MVD, and the third MVD; encoding a first syntax element indicating the first MVD for the block; and encoding a flag that indicates whether the second MVD and the third MVD for the block are both equal to the first MVD.
- 17. The method of claim 16, wherein the second MVD and the third MVD are both equal to the first MVD, and wherein the method further comprises: refraining from encoding a second syntax element indicating the second MVD for the second CPMV of the block; and refraining from encoding a third syntax element indicating the third MVD for the third CPMV of the block.
- 18. The method of claim 16, wherein the second MVD and the third MVD are not both equal to the first MVD, and wherein the method further comprises: encoding a second syntax element indicating the second MVD for the second CPMV of the block; and encoding a third syntax element indicating the third MVD for the third CPMV of the block.
- 19. The method of claim 16, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, and wherein encoding the flag comprises: encoding a single flag that indicates if the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both equal to the first MVD, wherein the single flag is applicable to both prediction directions for the 6-parameter bi-directional AMVP mode.
- 20. The method of claim 16, wherein the 6-parameter AMVP mode is a 6-parameter bi-directional AMVP mode, and wherein encoding the flag comprises: encoding a first flag for a first prediction direction that indicates whether the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both equal to the first MVD; and encoding a second flag for a second prediction direction that indicates whether the second MVD for the second CPMV for the block and the third MVD for a third CPMV for the block are both equal to the first MVD.
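As an illustration only, the encoder-side signaling of claims 16 to 18 and the second-direction handling of claims 6 and 7 might be sketched in Python as follows. All names here (`encode_affine_mvds`, `second_direction_mvds`, `mirror`) are hypothetical, the symbol list stands in for an entropy-coded bitstream, and the mirroring rule used to derive the second-direction MVD is an assumption chosen for the example; the claims do not state how that derivation is performed.

```python
from typing import Callable, List, Optional, Tuple

MV = Tuple[int, int]  # (horizontal, vertical) MVD components


def encode_affine_mvds(mvd0: MV, mvd1: MV, mvd2: MV) -> List:
    """Claims 16 to 18: emit the first MVD, the flag, and the rest only if needed."""
    all_equal = mvd1 == mvd0 and mvd2 == mvd0
    symbols: List = [mvd0, int(all_equal)]
    if not all_equal:
        # Flag is 0 whenever at least one MVD differs, so both are then signaled.
        symbols += [mvd1, mvd2]
    return symbols


def second_direction_mvds(
    mvd0_dir0: MV, derive: Optional[Callable[[MV], MV]] = None
) -> List[MV]:
    """Claims 6 and 7 (flag set): the three MVDs for the second prediction direction.

    derive=None models claim 6 (all MVDs zero); otherwise claim 7, where the
    first MVD is derived from the first direction and copied to the other two.
    """
    if derive is None:
        return [(0, 0), (0, 0), (0, 0)]
    first = derive(mvd0_dir0)
    return [first, first, first]


def mirror(mvd: MV) -> MV:
    """Assumed derivation for illustration: negate both components."""
    return (-mvd[0], -mvd[1])


if __name__ == "__main__":
    print(encode_affine_mvds((3, -1), (3, -1), (3, -1)))
    print(encode_affine_mvds((3, -1), (3, -1), (0, 5)))
    print(second_direction_mvds((3, -1)))
    print(second_direction_mvds((3, -1), mirror))
```

Passing the derivation in as a callable keeps the sketch neutral about which rule a real codec would use.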
Description
This application claims the benefit of U.S. Provisional Patent Application No. 63/495,957, filed Apr. 13, 2023, the entire content of which is incorporated by reference herein.
TECHNICAL FIELD
This disclosure relates to video encoding and video decoding.
BACKGROUND
Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), ITU-T H.266/Versatile Video Coding (VVC), and extensions of such standards, as well as proprietary video codecs/formats such as AOMedia Video 1 (AV1) that was developed by the Alliance for Open Media. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as coding tree units (CTUs), coding units (CUs) and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture.
Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.
SUMMARY
In general, this disclosure describes techniques for inter prediction in video codecs. More specifically, this disclosure describes devices and techniques for coding affine motion and/or affine motion models. In one example, this disclosure describes coding a flag that indicates if a second motion vector difference (MVD) and a third MVD are equal to a first MVD. In another example, this disclosure describes signaling techniques related to an advanced motion vector prediction (AMVP)-merge mode. In another example, this disclosure describes techniques related to affine flag checking when generating an affine candidate list using a constructed affine motion vector predictor. The techniques of this disclosure may improve coding efficiency and/or improve image quality.
In one example, this disclosure describes a method of decoding video data, the method comprising receiving a block of video data to be decoded using a 6-parameter affine AMVP mode, decoding a first syntax element indicating a first MVD for a first control point motion vector (CPMV) for the block, decoding a flag that indicates if a second MVD for a second CPMV for the block and a third MVD for a third CPMV for the block are equal to the first MVD, determining the second MVD and the third MVD based on the flag, and decoding the block of video data using the first MVD, the second MVD, and the third MVD to generate a decoded block.
In another example, this disclosure describes an apparatus configured to decode video data, the apparatus comprising a memory, and one or more processors implemented in circuitry and in communication with the memory, the one or more processors configured to receive a block of video data to be decoded using a 6-parameter affine AMVP mode, decode a first syntax element indicating a first MVD for a first CPMV for the block, decode a flag that indicates if a second MVD for a second CPMV for the block and a third MVD for a third CPMV for the block are equal to the first MVD, determine the second MVD and the third MVD based on the flag, and decode the block of video data using the first MVD, the second MVD, and the third MVD to generate a decoded block.
In another example, this disclosure describes a method of encoding video data, the method comprising receiving a block of video data to be encoded using a 6-parameter affine AMVP mode, determining a first MVD for a first CPMV for the block, determining a second MVD for a second CPMV for the block, determining a third MVD for a third CPMV for the block, encoding the block of video data using the first MVD, the second MVD, and the third MVD, encoding a first syntax element indicating the first MVD for the b