
EP-4740460-A1 - VIDEO CODING IN A MERGE MOTION VECTOR DIFFERENCE MODE UTILIZING DYNAMICALLY GENERATED PROBABILITIES


Abstract

Apparatuses and methods are disclosed including techniques for encoding and decoding video data. Techniques disclosed provide for coding a video block of the video data, including adaptively modifying a set of refinement vectors based on respective probability values and selecting a refinement vector from the modified set of refinement vectors, where the selected refinement vector is applied to refine a base motion vector. An index indicating the selected refinement vector is then coded into the bitstream. Techniques disclosed also provide for decoding the video block of the video data, including adaptively modifying the set of refinement vectors based on the probability values, decoding from the bitstream the index indicating the refinement vector from the modified set of refinement vectors, and extracting the refinement vector from the modified set of refinement vectors based on the index.

Inventors

  • LE GUYADEC, PASCAL
  • PURI, Saurabh
  • POIRIER, Tangi
  • THIEBAUD, SYLVAIN
  • BOSSEN, FRANK

Assignees

  • InterDigital CE Patent Holdings, SAS

Dates

Publication Date
20260513
Application Date
20240624

Claims (20)

  1. A method for encoding video data, comprising: coding, into a bitstream, a video block of the video data, the coding of the video block comprises: adaptively modifying a set of refinement vectors based on respective probability values, selecting a refinement vector from the modified set of refinement vectors, the selected refinement vector is applied to refine a base motion vector, and coding, into the bitstream, an index indicating the selected refinement vector.
  2. The method according to claim 1, further comprising: coding, into the bitstream, a flag indicating that the set of refinement vectors is modified.
  3. The method according to claim 1 or 2, wherein the adaptively modifying the set of refinement vectors further comprises: deriving a motion activity indicator, characterizing content motion in a neighborhood of the video block; and modifying the set of refinement vectors based on the motion activity indicator.
  4. The method according to claim 3, wherein the set of refinement vectors is modified if the motion activity indicator is above a predetermined threshold.
  5. The method according to any one of claims 3 to 4, wherein the deriving of the motion activity indicator comprises: determining the motion activity indicator based on dynamic coding data, including one or more of the respective probability values, a quantization parameter associated with the video block, a size associated with the video block, or a combination thereof.
  6. The method according to any one of claims 1 to 5, wherein a refinement vector in the set of refinement vectors is defined by a distance value from a distance table and by a direction value from a direction table.
  7. The method according to claim 6, wherein the adaptively modifying the set of refinement vectors comprises: modifying the distance table, the modifying includes reducing a number of elements of the distance table, reordering the elements of the distance table, changing one or more values of the elements of the distance table, or a combination thereof.
  8. The method according to claim 6, wherein the adaptively modifying the set of refinement vectors comprises: modifying the direction table, the modifying includes reducing a number of elements of the direction table, reordering the elements of the direction table, changing one or more values of the elements of the direction table, or a combination thereof.
  9. The method according to any one of claims 1 to 8, further comprising: adaptively modifying a list of merge candidates based on one or more of respective probability values, wherein a merge video block is selected from the modified list of merge candidates, and wherein the base motion vector is a motion vector of the merge video block; and coding, into the bitstream, an index indicating the merge video block.
  10. The method according to claim 9, further comprising: coding, into the bitstream, a flag indicating that the list of merge candidates is modified.
  11. A method for decoding video data, comprising: decoding, from a bitstream, a video block of the video data, the decoding of the video block comprises: adaptively modifying a set of refinement vectors based on respective probability values, decoding, from the bitstream, an index indicating a refinement vector from the modified set of refinement vectors, and extracting, based on the index, the refinement vector from the modified set of refinement vectors, the extracted refinement vector is applied to refine a base motion vector.
  12. The method according to claim 11, further comprising: decoding, from the bitstream, a flag indicating whether the set of refinement vectors should be modified; and performing the adaptively modifying of the set of refinement vectors if the flag indicates that the set of refinement vectors should be modified.
  13. The method according to claim 11 or 12, wherein the adaptively modifying the set of refinement vectors further comprises: deriving a motion activity indicator, characterizing content motion in a neighborhood of the video block; and modifying the set of refinement vectors based on the motion activity indicator, wherein the set of refinement vectors is modified if the motion activity indicator is above a predetermined threshold.
  14. The method according to claim 13, wherein the deriving of the motion activity indicator comprises: determining the motion activity indicator based on dynamic coding data, including one or more of the respective probability values, a quantization parameter associated with the video block, a size associated with the video block, or a combination thereof.
  15. The method according to any one of claims 11 to 14, wherein a refinement vector in the set of refinement vectors is defined by a distance value from a distance table and by a direction value from a direction table.
  16. The method according to claim 15, wherein the adaptively modifying the set of refinement vectors comprises: modifying the distance table, the modifying includes reducing a number of elements of the distance table, reordering the elements of the distance table, changing one or more values of the elements of the distance table, or a combination thereof.
  17. The method according to claim 15, wherein the adaptively modifying the set of refinement vectors comprises: modifying the direction table, the modifying includes reducing a number of elements of the direction table, reordering the elements of the direction table, changing one or more values of the elements of the direction table, or a combination thereof.
  18. The method according to any one of claims 11 to 17, further comprising: adaptively modifying a list of merge candidates based on one or more of respective probability values; decoding, from the bitstream, an index indicating the merge video block; and extracting, based on the index, a merge video block from the modified list of merge candidates, wherein the base motion vector is a motion vector of the merge video block.
  19. The method according to claim 18, further comprising: decoding, from the bitstream, a flag indicating whether the list of merge candidates should be modified; and performing the adaptively modifying of the list of merge candidates if the flag indicates that the list of merge candidates should be modified.
  20. An apparatus for encoding video data, comprising: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the apparatus to: code, into a bitstream, a video block of the video data, the coding of the video block comprises: adaptively modifying a set of refinement vectors based on respective probability values, selecting a refinement vector from the modified set of refinement vectors, the selected refinement vector is applied to refine a motion vector of a merge video block, and coding, into the bitstream, an index indicating the selected refinement vector.
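
As an illustration of the table structure recited in claims 6 to 8 and 15 to 17, the following Python sketch builds a refinement vector set from a distance table and a direction table and shows the three kinds of table modification the claims name (reducing, reordering, changing values). The table values, the function names, and the rule of reordering by descending probability are assumptions made for this example; the claims do not specify them.

```python
from itertools import product

# Hypothetical tables: the values are illustrative and not taken from the claims.
DISTANCES = [1, 2, 4, 8]                          # assumed units: fractional samples
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]   # +x, -x, +y, -y


def build_refinement_set(distances, directions):
    """Each refinement vector is a distance value applied along a direction value."""
    return [(d * dx, d * dy) for d, (dx, dy) in product(distances, directions)]


def reduce_table(table, keep):
    """Reduce the number of elements: keep only the first `keep` entries."""
    return table[:keep]


def reorder_table(table, probabilities):
    """Reorder elements so that more probable entries come first.

    Assumed rule: stable sort by descending probability.
    """
    order = sorted(range(len(table)), key=lambda i: -probabilities[i])
    return [table[i] for i in order]


def change_values(table, scale):
    """Change element values, here by a uniform scale factor (an assumption)."""
    return [value * scale for value in table]
```

For example, `reorder_table(DISTANCES, [0.1, 0.5, 0.3, 0.1])` moves the distance `2` to the front, so the most probable distance would receive the smallest index.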

Description

VIDEO CODING IN A MERGE MOTION VECTOR DIFFERENCE MODE UTILIZING DYNAMICALLY GENERATED PROBABILITIES

CROSS REFERENCE TO RELATED APPLICATIONS

[1] This application claims the benefit of European Application No. 23306121.7, filed on July 3, 2023, which is incorporated herein by reference in its entirety.

BACKGROUND

[2] In the recent versatile video coding (VVC) standard and enhanced compression model (ECM), several enhancements have been made to the coding of motion associated with a video block. Particularly, a merge motion vector difference (MMVD) mode has been introduced to enhance motion coding in a merge mode. Operating in an MMVD mode allows for the refinement of a motion vector (namely, a base motion vector) derived in a merge mode. The refined base motion vector can then be used for motion compensated prediction of the video block. The best refinement vector, selected from a refinement vector set, is used to refine a base motion vector, selected from motion vectors of respective merge video block candidates. The larger the refinement vector set is and the larger the number of base motion vectors being tested is, the more accurate the refined base motion vector that can be found will be. However, it is computationally costly to test a large refinement vector set for each of the base motion vectors to find the refinement vector that results in the most accurate refined base motion vector.

SUMMARY

[3] Aspects disclosed in the present disclosure describe methods for encoding video data. The methods include coding a video block of the video data into a bitstream. The coding of the video block comprises adaptively modifying a set of refinement vectors based on respective probability values and selecting a refinement vector from the modified set of refinement vectors, where the selected refinement vector is applied to refine a base motion vector. An index indicating the selected refinement vector is then coded into the bitstream.
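
The encoding steps summarized in paragraph [3] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the modification rule (reordering by descending probability), the cost function (sum of absolute differences to a hypothetical `target_mv`), and all function names are introduced here for the example only. A real encoder would typically use a rate-distortion cost instead.

```python
def modify_refinement_set(vectors, probabilities):
    """Assumed modification: reorder so higher-probability vectors get smaller indices."""
    order = sorted(range(len(vectors)), key=lambda i: -probabilities[i])
    return [vectors[i] for i in order]


def encode_refinement(base_mv, target_mv, vectors, probabilities):
    """Return (index, refined_mv) for the best vector in the modified set.

    `target_mv` is a hypothetical stand-in for the motion the encoder's
    search is trying to reach.
    """
    modified = modify_refinement_set(vectors, probabilities)

    def cost(vector):
        refined_x = base_mv[0] + vector[0]
        refined_y = base_mv[1] + vector[1]
        return abs(refined_x - target_mv[0]) + abs(refined_y - target_mv[1])

    # Select the refinement vector with the lowest cost; its position in the
    # modified set is the index that would be coded into the bitstream.
    index = min(range(len(modified)), key=lambda i: cost(modified[i]))
    best = modified[index]
    return index, (base_mv[0] + best[0], base_mv[1] + best[1])
```

With `vectors = [(1, 0), (-1, 0), (0, 1), (0, -1)]` and `probabilities = [0.1, 0.2, 0.6, 0.1]`, the vector `(0, 1)` moves to index 0, so selecting it costs a smaller index than in the unmodified set, where it sits at index 2.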
Aspects disclosed in the present disclosure also describe methods for decoding the video data. The methods include decoding a video block of the video data from the bitstream. The decoding of the video block comprises adaptively modifying a set of refinement vectors based on respective probability values. The decoding of the video block further includes decoding from the bitstream an index indicating a refinement vector from the modified set of refinement vectors. Based on the index, the refinement vector is extracted from the modified set of refinement vectors.

[4] Aspects disclosed in the present disclosure describe an apparatus for encoding video data. The apparatus comprises at least one processor and memory storing instructions. The instructions, when executed by the at least one processor, cause the apparatus to code a video block of the video data into a bitstream. The coding of the video block comprises adaptively modifying a set of refinement vectors based on respective probability values and selecting a refinement vector from the modified set of refinement vectors, where the selected refinement vector is applied to refine a base motion vector. An index indicating the selected refinement vector is then coded into the bitstream.

Aspects disclosed in the present disclosure also describe an apparatus for decoding video data. The apparatus comprises at least one processor and memory storing instructions. The instructions, when executed by the at least one processor, cause the apparatus to decode a video block of the video data from the bitstream. The decoding of the video block comprises adaptively modifying a set of refinement vectors based on respective probability values. The decoding of the video block further includes decoding from the bitstream an index indicating a refinement vector from the modified set of refinement vectors. Based on the index, the refinement vector is extracted from the modified set of refinement vectors.
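
The decoding side described above can be sketched in Python as follows. The sketch assumes that the decoder has access to the same probability values as the encoder, so that both derive the same modified set, and it gates the modification on a decoded flag as in claim 12. The reordering rule and the names `decode_refinement` and `modify_flag` are assumptions for this example, not details given in the source.

```python
def modify_refinement_set(vectors, probabilities):
    """Assumed modification: reorder so higher-probability vectors get smaller indices."""
    order = sorted(range(len(vectors)), key=lambda i: -probabilities[i])
    return [vectors[i] for i in order]


def decode_refinement(base_mv, index, vectors, probabilities, modify_flag=True):
    """Recover the refined motion vector from a decoded index.

    `modify_flag` models a decoded flag indicating whether the set of
    refinement vectors should be modified before the index is interpreted.
    """
    candidates = (
        modify_refinement_set(vectors, probabilities) if modify_flag else vectors
    )
    # Extract the refinement vector from the (possibly modified) set by index,
    # then apply it to the base motion vector.
    refinement = candidates[index]
    return (base_mv[0] + refinement[0], base_mv[1] + refinement[1])
```

The same index can therefore map to different refinement vectors depending on the flag: with `vectors = [(1, 0), (-1, 0), (0, 1), (0, -1)]` and `probabilities = [0.1, 0.2, 0.6, 0.1]`, index 0 selects `(0, 1)` when the set is modified and `(1, 0)` when it is not.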
[5] Further aspects disclosed in the present disclosure describe a non-transitory computer-readable medium comprising instructions executable by at least one processor to perform methods for encoding video data. The methods include coding a video block of the video data into a bitstream. The coding of the video block comprises adaptively modifying a set of refinement vectors based on respective probability values and selecting a refinement vector from the modified set of refinement vectors, where the selected refinement vector is applied to refine a base motion vector. An index indicating the selected refinement vector is then coded into the bitstream.

Further aspects disclosed in the present disclosure also describe a non-transitory computer-readable medium comprising instructions executable by at least one processor to perform methods for decoding video data. The methods include decoding a video block of the video data from the bitstream. The decoding of the video block comprises adaptively modifying a set of refinement vectors based on respective