EP-4736449-A1 - BLOCK VECTOR SEARCH FOR INTRA-BLOCK COPY PREDICTION FOR CODING VIDEO DATA

Abstract

An example device for decoding video data includes a memory configured to store video data; and a processing system comprising one or more processors implemented in circuitry, the processing system being configured to: determine that vector information for a current block of video data is to be decoded using a merge with vector difference mode; determine a search process to be used to determine a vector difference for the vector information; select a merge candidate from a merge candidate list to determine a vector predictor for the vector information; perform the search process to determine the vector difference; apply the vector difference to the vector predictor to form a final vector; and form a prediction block for the current block using the final vector.
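The decoding flow summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the search patterns (`CROSS`, `SQUARE`), the `SEARCH_PROCESSES` table, the whole-pixel vector units, and the caller-supplied `cost` function are all hypothetical names and choices that do not appear in the source.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vec = Tuple[int, int]  # (dx, dy) in whole pixels; an assumption of this sketch

# Hypothetical search patterns keyed by a signaled identifier.
# The source does not define the actual patterns or their identifiers.
CROSS: List[Vec] = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
SQUARE: List[Vec] = CROSS + [(1, 1), (1, -1), (-1, 1), (-1, -1)]
SEARCH_PROCESSES: Dict[int, List[Vec]] = {0: CROSS, 1: SQUARE}


def add(a: Vec, b: Vec) -> Vec:
    """Component-wise sum of two vectors."""
    return (a[0] + b[0], a[1] + b[1])


def derive_final_vector(
    search_id: int,
    merge_list: Sequence[Vec],
    merge_idx: int,
    cost: Callable[[Vec], int],
) -> Vec:
    """Predictor from the merge list plus the best-cost vector difference."""
    offsets = SEARCH_PROCESSES[search_id]   # determine the search process
    predictor = merge_list[merge_idx]       # select the merge candidate
    # Perform the search: keep the offset whose resulting vector costs least.
    difference = min(offsets, key=lambda d: cost(add(predictor, d)))
    return add(predictor, difference)       # apply the vector difference


def form_prediction_block(
    picture: Sequence[Sequence[int]], x: int, y: int, w: int, h: int, vector: Vec
) -> List[List[int]]:
    """Copy the w-by-h block the vector points to.

    Assumes the referenced block lies fully inside the picture; no bounds
    checking is done here.
    """
    rx, ry = x + vector[0], y + vector[1]
    return [list(row[rx:rx + w]) for row in picture[ry:ry + h]]
```

In a real decoder the cost would have to be computable without the current block's samples, for example by template matching on already reconstructed neighbors; here it is left as a parameter so the control flow stays visible.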

Inventors

VERBA, Gleb
ZHANG, ZHI
SEREGIN, VADIM
KARCZEWICZ, MARTA

Assignees

QUALCOMM INCORPORATED

Dates

Publication Date: 20260506
Application Date: 20240524

Claims (20)

1. A method of decoding video data, the method comprising: determining that vector information for a current block of video data is to be coded using a merge with vector difference mode; determining a search process to be used to determine a vector difference for the vector information; selecting a merge candidate from a merge candidate list to determine a vector predictor for the vector information; performing the search process to determine the vector difference; applying the vector difference to the vector predictor to form a final vector; and forming a prediction block for the current block using the final vector.
2. The method of claim 1, wherein determining the search process comprises decoding data indicative of the search process.
3. The method of claim 2, wherein decoding the data indicative of the search process comprises decoding at least one of a sequence parameter set (SPS), a picture parameter set (PPS), a slice header, or a block header including the data indicative of the search process.
4. The method of claim 2, wherein decoding the data indicative of the search process comprises decoding the data indicative of the search process in response to determining that the merge with vector difference mode is enabled for the current block.
5. The method of claim 4, wherein determining that the merge with vector difference mode is enabled for the current block comprises coding data of a sequence parameter set (SPS) or a picture parameter set (PPS) indicating that the merge with vector difference mode is enabled for the current block.
6. The method of claim 1, wherein selecting the merge candidate comprises decoding a merge candidate index indicative of the merge candidate in the merge candidate list.
7. The method of claim 1, wherein performing the search process comprises performing a first search in an integer pixel domain and performing a second search in a fractional pixel domain.
8. The method of claim 7, wherein performing the first search comprises performing the first search as a first search phase, and wherein performing the second search comprises performing the second search as a second search phase following the first search phase and starting at a result of the first search phase.
9. The method of claim 7, wherein performing the first search comprises testing a plurality of integer value offsets relative to the vector predictor.
10. The method of claim 1, wherein performing the search process comprises searching only potential vector differences that, when applied to the vector information, refer to a reference block for which all pixels needed for calculating a difference metric are available in a reference area.
11. The method of claim 10, wherein the pixels needed for calculating the difference metric comprise at least one of pixels needed for performing interpolation or pixels needed for performing template matching refinement.
12. The method of claim 1, further comprising decoding the current block using the prediction block.
13. The method of claim 1, further comprising encoding the current block prior to decoding the current block.
14. A device for decoding video data, the device comprising: a memory configured to store video data; and a processing system comprising one or more processors implemented in circuitry, the processing system being configured to: determine that vector information for a current block of video data is to be coded using a merge with vector difference mode; determine a search process to be used to determine a vector difference for the vector information; select a merge candidate from a merge candidate list to determine a vector predictor for the vector information; perform the search process to determine the vector difference; apply the vector difference to the vector predictor to form a final vector; and form a prediction block for the current block using the final vector.
15. The device of claim 14, wherein to determine the search process, the processing system is configured to decode data indicative of the search process.
16. The device of claim 14, wherein to select the merge candidate, the processing system is configured to decode a merge candidate index indicative of the merge candidate in the merge candidate list.
17. The device of claim 14, wherein to perform the search process, the processing system is configured to perform a first search in an integer pixel domain and perform a second search in a fractional pixel domain.
18. The device of claim 14, wherein to perform the search process, the processing system is configured to search only potential vector differences that, when applied to the vector information, refer to a reference block for which all pixels needed for determining a difference metric are available in a reference area.
19. The device of claim 14, further comprising a display configured to display decoded video data.
20. The device of claim 14, wherein the device comprises one or more of a camera, a computer, a mobile device, a broadcast receiver device, or a set-top box.
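As a rough illustration of the two-phase search of claims 7 to 9 combined with the availability constraint of claims 10 and 11, the sketch below first tests integer-pixel offsets around the vector predictor and then tests fractional offsets around the integer result, skipping any candidate whose reference block (plus an interpolation margin) is not fully inside the reference area. Several things here are assumptions that do not come from the source: quarter-pel vector units (`QPEL`), a rectangular reference area, a single `margin` applied on all sides whenever either component is fractional, the search ranges, and all function names.

```python
from typing import Callable, Iterable, Optional, Tuple

Vec = Tuple[int, int]  # (dx, dy) in quarter-pel units; an assumption
QPEL = 4               # quarter-pel steps per whole pixel


def is_available(vec: Vec, block: Tuple[int, int, int, int],
                 area: Tuple[int, int, int, int], margin: int) -> bool:
    """True if every pixel needed for the cost lies inside the reference area.

    block = (x, y, w, h) in pixels; area = (x0, y0, x1, y1), end-exclusive.
    A fractional vector needs `margin` extra pixels for interpolation.
    """
    x, y, w, h = block
    x0, y0, x1, y1 = area
    fractional = vec[0] % QPEL != 0 or vec[1] % QPEL != 0
    m = margin if fractional else 0
    left = x + vec[0] // QPEL - m             # floor of the pixel offset
    top = y + vec[1] // QPEL - m
    right = x + w + -(-vec[0] // QPEL) + m    # ceiling of the pixel offset
    bottom = y + h + -(-vec[1] // QPEL) + m
    return x0 <= left and y0 <= top and right <= x1 and bottom <= y1


def best_offset(steps: Iterable[Vec], start: Vec, predictor: Vec,
                cost: Callable[[Vec], int],
                available: Callable[[Vec], bool]) -> Optional[Vec]:
    """Lowest-cost offset (start + step) among available candidates."""
    best, best_cost = None, None
    for sx, sy in steps:
        diff = (start[0] + sx, start[1] + sy)
        vec = (predictor[0] + diff[0], predictor[1] + diff[1])
        if not available(vec):
            continue  # only search candidates whose pixels are all available
        c = cost(vec)
        if best_cost is None or c < best_cost:
            best, best_cost = diff, c
    return best


def two_phase_search(predictor: Vec, cost: Callable[[Vec], int],
                     available: Callable[[Vec], bool],
                     int_range: int = 2) -> Optional[Vec]:
    """Integer-pixel phase, then a fractional phase starting at its result."""
    rng = range(-int_range, int_range + 1)
    integer_steps = [(dx * QPEL, dy * QPEL) for dy in rng for dx in rng]
    first = best_offset(integer_steps, (0, 0), predictor, cost, available)
    if first is None:
        return None  # no integer candidate was available
    frac = range(-(QPEL - 1), QPEL)  # includes 0, so `first` stays a candidate
    fractional_steps = [(fx, fy) for fy in frac for fx in frac]
    return best_offset(fractional_steps, first, predictor, cost, available)
```

The returned value is the vector difference, which would then be added to the vector predictor to form the final vector. Returning `None` when no integer candidate is available is a choice made for this sketch only.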

Description

BLOCK VECTOR SEARCH FOR INTRA-BLOCK COPY PREDICTION FOR CODING VIDEO DATA

[0001] This application claims priority to U.S. Patent Application No. 18/673,001, filed May 23, 2024, and U.S. Provisional Application No. 63/510,607, filed June 27, 2023, the entire contents of each of which are incorporated by reference herein. U.S. Patent Application No. 18/673,001, filed May 23, 2024, claims the benefit of U.S. Provisional Application No. 63/510,607, filed June 27, 2023.

TECHNICAL FIELD

[0002] This disclosure relates to video coding, including video encoding and video decoding.

BACKGROUND

[0003] Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), ITU-T H.266/Versatile Video Coding (VVC), and extensions of such standards, as well as proprietary video codecs/formats such as AOMedia Video 1 (AV1) developed by the Alliance for Open Media. The video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.

[0004] Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences.
For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be partitioned into video blocks, which may also be referred to as coding tree units (CTUs), coding units (CUs) and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.

SUMMARY

[0005] In general, this disclosure describes techniques related to intra block copy and inter prediction in video coding systems. In particular, this disclosure describes techniques that may be used to derive a base block vector, a base motion vector, and a block vector difference candidate list, e.g., for intra-block copy (IBC) merge mode with block vector differences (IBC-MBVD) or IBC advanced motion vector prediction (IBC AMVP) mode. The techniques may further include derivation of a motion vector difference candidate list for affine merge with motion vector difference (MMVD) mode, geometric MMVD mode, or MMVD for regular merge mode.

[0006] In particular, a vector (such as a block vector for intra-block copy (IBC) mode or a motion vector for inter-prediction) may initially be predicted using merge mode vector prediction techniques, which generally include coding of a merge candidate index. A decoder-side motion vector refinement (DMVR) technique, such as template matching refinement, may be used to refine the vector predictor determined from the merge candidate.
DMVR generally includes performing a search in a search area relative to a point indicated by the vector predictor to determine an offset to be applied to the vector predictor that results in a refined vector. Per the techniques of this disclosure, a search process may be signaled (e.g., represented by an encoded syntax element in a bitstream). Signaling the search process to be used may improve efficiency of the search process and yield an offset that results in a prediction block that best represents a current block, which may reduce the bitrate associated with coding residual data for the current block.

[0007] In one example, a method of decoding video data includes: determining that vector information for a current block of video data is to be coded using a merge with vector difference mode; determining a search process to be used to determine a vector difference for the vector information; selecting a merge candidate from a merge candidate list to determine a vector predictor for the vector information; performing the search process to determine the vector difference; applying the vector difference to the vector predictor to form a final vector; and forming a prediction block for the current block using the final vector.

[0008] In another exam