CN-121986486-A - Inheriting local illumination compensation parameters in merge mode in video coding

CN 121986486 A

Abstract

The video coder is configured to receive a block of video data to be coded using a merge mode and local illumination compensation (LIC) and to determine a merge candidate for the block of video data. If the merge candidate is a non-neighboring candidate or a history-based motion vector predictor candidate, the video coder is configured to inherit the LIC parameters associated with the merge candidate. If the merge candidate is a neighboring candidate, the video coder is configured to derive the LIC parameters for the block of video data using neighboring templates of reconstructed samples and reference templates in a reference frame.
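The inherit-versus-derive decision described in the abstract can be sketched as follows. This is an illustrative, hypothetical sketch and not part of the patent text; the function and type names are invented for the example, and `derive_from_templates` stands in for the template-based derivation the abstract describes.

```python
# Hypothetical sketch of the candidate-type decision described in the abstract.
from enum import Enum, auto

class CandidateType(Enum):
    NEIGHBORING = auto()
    NON_NEIGHBORING = auto()
    HISTORY_BASED = auto()  # history-based motion vector predictor (HMVP)

def lic_parameters_for_block(candidate_type, stored_params, derive_from_templates):
    """Return the LIC parameters for a merge-mode block coded with LIC.

    stored_params: parameters stored with a non-neighboring or HMVP candidate.
    derive_from_templates: callable that computes fresh parameters from the
    neighboring template of reconstructed samples and the reference template.
    """
    if candidate_type in (CandidateType.NON_NEIGHBORING, CandidateType.HISTORY_BASED):
        return stored_params          # inherit the candidate's parameters
    return derive_from_templates()    # neighboring candidate: derive anew
```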

Inventors

  • Han HUANG
  • Vadim SEREGIN
  • Marta KARCZEWICZ

Assignees

  • Qualcomm Incorporated

Dates

Publication Date
2026-05-05
Application Date
2024-09-16
Priority Date
2024-09-13

Claims (20)

  1. A method of coding video data, the method comprising: receiving a first block of video data to be coded using a merge mode and local illumination compensation (LIC); determining a first merge candidate for the first block of video data; inheriting a first LIC parameter associated with the first merge candidate based on the first merge candidate being a non-neighboring candidate or a history-based motion vector predictor candidate; and coding the first block of video data using the first merge candidate and the first LIC parameter.
  2. The method of claim 1, wherein the first merge candidate is the non-neighboring candidate and is a threshold number of samples away from a boundary of the first block of video data.
  3. The method of claim 1, wherein the first merge candidate is the non-neighboring candidate, the method further comprising: storing the first LIC parameter for the non-neighboring candidate based on the non-neighboring candidate being within a current coding tree unit of the first block of video data or at a boundary of the current coding tree unit of the first block of video data.
  4. The method of claim 1, wherein the first merge candidate is the history-based motion vector predictor candidate, the method further comprising: storing the first LIC parameter with the history-based motion vector predictor candidate.
  5. The method of claim 1, wherein the first LIC parameter is a single LIC parameter set stored for an N×N region of the video data.
  6. The method of claim 1, the method further comprising: quantizing the first LIC parameter before the first LIC parameter is stored in a buffer and before the first LIC parameter is inherited.
  7. The method of claim 1, the method further comprising: receiving a second block of video data to be coded using the merge mode and LIC; determining a second merge candidate for the second block of video data; deriving a second LIC parameter for the second block of video data using a neighboring template of reconstructed samples and a reference template in a reference frame based on the second merge candidate being a neighboring candidate; and coding the second block of video data using the second merge candidate and the second LIC parameter.
  8. The method of claim 1, wherein coding comprises decoding.
  9. The method of claim 1, wherein coding comprises encoding.
  10. A device configured to code video data, the device comprising: a memory; and processing circuitry in communication with the memory, the processing circuitry configured to: receive a first block of video data to be coded using a merge mode and local illumination compensation (LIC); determine a first merge candidate for the first block of video data; inherit a first LIC parameter associated with the first merge candidate based on the first merge candidate being a non-neighboring candidate or a history-based motion vector predictor candidate; and code the first block of video data using the first merge candidate and the first LIC parameter.
  11. The device of claim 10, wherein the first merge candidate is the non-neighboring candidate and is a threshold number of samples away from a boundary of the first block of video data.
  12. The device of claim 10, wherein the first merge candidate is the non-neighboring candidate, and wherein the processing circuitry is further configured to: store the first LIC parameter for the non-neighboring candidate based on the non-neighboring candidate being within a current coding tree unit of the first block of video data or at a boundary of the current coding tree unit of the first block of video data.
  13. The device of claim 10, wherein the first merge candidate is the history-based motion vector predictor candidate, and wherein the processing circuitry is further configured to: store the first LIC parameter with the history-based motion vector predictor candidate.
  14. The device of claim 10, wherein the first LIC parameter is a single LIC parameter set stored for an N×N region of the video data.
  15. The device of claim 10, wherein the processing circuitry is further configured to: quantize the first LIC parameter before the first LIC parameter is stored in a buffer and before the first LIC parameter is inherited.
  16. The device of claim 10, wherein the processing circuitry is further configured to: receive a second block of video data to be coded using the merge mode and LIC; determine a second merge candidate for the second block of video data; derive a second LIC parameter for the second block of video data using a neighboring template of reconstructed samples and a reference template in a reference frame based on the second merge candidate being a neighboring candidate; and code the second block of video data using the second merge candidate and the second LIC parameter.
  17. The device of claim 10, wherein the processing circuitry is configured to decode the video data.
  18. The device of claim 10, wherein the processing circuitry is configured to encode the video data.
  19. A non-transitory computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device configured to code video data to: receive a first block of video data to be coded using a merge mode and local illumination compensation (LIC); determine a first merge candidate for the first block of video data; inherit a first LIC parameter associated with the first merge candidate based on the first merge candidate being a non-neighboring candidate or a history-based motion vector predictor candidate; and code the first block of video data using the first merge candidate and the first LIC parameter.
  20. The non-transitory computer-readable storage medium of claim 19, wherein the instructions further cause the one or more processors to: receive a second block of video data to be coded using the merge mode and LIC; determine a second merge candidate for the second block of video data; derive a second LIC parameter for the second block of video data using a neighboring template of reconstructed samples and a reference template in a reference frame based on the second merge candidate being a neighboring candidate; and code the second block of video data using the second merge candidate and the second LIC parameter.

Description

Inheriting local illumination compensation parameters in merge mode in video coding

The present application claims priority to U.S. patent application Ser. No. 18/884,960, filed September 13, 2024, and U.S. provisional application Ser. No. 63/590,339, filed October 13, 2023, each of which is hereby incorporated by reference in its entirety. U.S. patent application Ser. No. 18/884,960 claims the benefit of U.S. provisional application Ser. No. 63/590,339.

Technical Field

The present disclosure relates to video encoding and video decoding.

Background

Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, electronic book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video gaming consoles, cellular or satellite radio telephones (so-called "smartphones"), video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and ITU-T H.266/Versatile Video Coding (VVC), and extensions of such standards, as well as proprietary video codecs/formats such as AOMedia Video 1 (AV1), developed by the Alliance for Open Media. By implementing such video coding techniques, video devices may more efficiently transmit, receive, encode, decode, and/or store digital video information. Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences.
For block-based video coding, a video slice (e.g., a video picture or a portion of a video picture) may be divided into video blocks, which may also be referred to as coding tree units (CTUs), coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture or temporal prediction with respect to reference samples in other reference pictures. A picture may be referred to as a frame, and a reference picture may be referred to as a reference frame.

Disclosure of Invention

In general, this disclosure describes techniques for inter prediction in video codecs. More specifically, the present disclosure describes techniques related to interactions between local illumination compensation (LIC) and merge modes. LIC is an inter prediction technique that models local illumination changes between a current block and its prediction block as a function of a current block template and a reference block template. In one example of the present disclosure, a video coder may be configured to determine LIC parameters for a current block of video data based on the type of merge candidate used to code the current block in a merge mode. If the merge candidate is a non-neighboring candidate or a history-based motion vector predictor, the video coder may be configured to inherit the LIC parameters from the merge candidate. That is, instead of deriving new LIC parameters, the video coder may reuse the previously determined LIC parameters associated with the merge candidate.
If the merge candidate is a neighboring merge candidate, the video coder may use a neighboring template of reconstructed samples and a reference template in the reference frame to derive new LIC parameters for the current block. The use of a non-neighboring merge candidate or a history-based motion vector predictor may indicate that the neighboring blocks (e.g., the neighboring merge candidates) differ significantly from the current block, at least in terms of having relevant motion information. Hence, deriving new LIC parameters from neighboring samples in this case may lead to less ideal LIC parameters. Thus, inheriting the LIC parameters associated with non-neighboring merge candidates or history-based motion vector predictors may result in better coding efficiency and/or reduced distortion. In one example, the present disclosure describes a method of coding video data that includes receiving a first block of video data to be coded using a merge mode and LIC, determining a first merge candidate for the first block of video data, inheriting a first LIC parameter associated with the first merge candidate based on the first merge candidate being a non-neighboring candidate or a history-based motion vector predictor candidate, and coding the first block of video data using the first merge candidate and the first LIC parameter.
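The template-based derivation for the neighboring-candidate case can be illustrated with a simple sketch. In LIC as commonly described for exploration codecs, the prediction is scaled and offset as a·pred + b, with a and b obtained by a least-squares fit between the reference template and the current block's template of reconstructed samples. The floating-point arithmetic below is illustrative only and does not reflect any normative integer implementation in the disclosure.

```python
# Illustrative sketch of template-based LIC parameter derivation: fit the
# linear model cur ≈ a * ref + b over the template samples by least squares.
def derive_lic_params(ref_template, cur_template):
    n = len(ref_template)
    sx = sum(ref_template)                                  # Σ ref
    sy = sum(cur_template)                                  # Σ cur
    sxx = sum(x * x for x in ref_template)                  # Σ ref²
    sxy = sum(x * y for x, y in zip(ref_template, cur_template))  # Σ ref·cur
    denom = n * sxx - sx * sx
    if denom == 0:                       # flat template: fall back to offset-only
        return 1.0, (sy - sx) / n
    a = (n * sxy - sx * sy) / denom      # scale
    b = (sy - a * sx) / n                # offset
    return a, b
```

For example, a reference template of [10, 20, 30, 40] and a current template of [25, 45, 65, 85] yields a = 2.0 and b = 5.0, i.e., an illumination change that doubles intensity and adds an offset of 5.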