US-12627825-B2 - Video coding and decoding

Abstract

The invention relates to signalling affine mode in an encoded video stream; in particular determining a list of merge candidates corresponding to blocks neighbouring a current block; and signalling affine mode for said current block; wherein signalling said affine mode comprises decoding a context encoded flag from the data stream, and wherein the context variable for said flag is determined based on whether or not said neighbouring blocks use affine mode. Related encoding and decoding methods and devices are also disclosed.

Inventors

Guillaume Laroche
Christophe Gisquet
Patrice Onno
Jonathan Taquet

Assignees

CANON KABUSHIKI KAISHA

Dates

Publication Date: 20260512
Application Date: 20240701
Priority Date: 20180921

Claims (14)

1. A method of decoding an image from a bitstream encoded using motion prediction, the method comprising: decoding a flag being capable of indicating that a current block is not skipped; decoding prediction mode information used to determine whether a prediction mode for the current block is an intra mode or an inter mode, when the flag indicates that the current block is not skipped; determining, from a plurality of prediction modes including the intra mode and the inter mode, the prediction mode used to decode the current block of the image, based on the prediction mode information; compiling a list of candidate motion predictors in the case where the inter mode is determined for the current block; and placing a candidate for subblock affine prediction as a merge candidate lower in said list than a temporal motion vector candidate, the subblock affine prediction deriving at least one motion vector per subblock in the current block by using two or three motion vectors from a block which is of the same frame as the current block, wherein a position of the candidate for subblock affine prediction in said list is determined based on whether or not the block which is of the same frame as the current block uses subblock affine prediction, the block of the same frame corresponding to a block at a position A1 at the bottom left of the current block.
2. The method of decoding according to claim 1, further comprising: selecting a subblock merge mode with subblock affine prediction for the current block, wherein: the flag is a first flag; selecting the subblock merge mode with subblock affine prediction comprises decoding a second flag from the bitstream using CABAC decoding; and a context variable for said second flag is determined based on whether or not a first block neighboring said current block uses subblock affine prediction, and whether or not a second block neighboring said current block uses subblock affine prediction.
3. The method of decoding according to claim 2, wherein the first block is located at the left of the current block and the second block is located above the current block.
4. The method of decoding according to claim 1, wherein in a state where the current block has a size of 16×16, the number of subblocks in the current block is 16 and at least one motion vector per subblock in the current block is to be derived by using the two or three motion vectors in subblock affine prediction.
5. The method of decoding according to claim 1, wherein the temporal motion vector candidate uses a motion vector in a block which is part of an image different from the image including the current block.
6. A method of encoding an image into a bitstream using motion prediction, the method comprising: determining, from a plurality of prediction modes including an intra mode and an inter mode, a prediction mode used to encode a current block of the image; including in the bitstream a flag being capable of indicating that the current block is not skipped; including in the bitstream prediction mode information used to determine whether a prediction mode for the current block is the intra mode or the inter mode, when the flag indicates that the current block is not skipped; compiling a list of candidate motion predictors in the case where the inter mode is determined for the current block; and placing a candidate for subblock affine prediction as a merge candidate lower in said list than a temporal motion vector candidate, the subblock affine prediction deriving at least one motion vector per subblock in the current block by using two or three motion vectors from a block which is of the same frame as the current block, wherein a position of the candidate for subblock affine prediction in said list is determined based on whether or not the block which is of the same frame as the current block uses subblock affine prediction, the block of the same frame corresponding to a block at a position A1 at the bottom left of the current block.
7. The method of encoding according to claim 6, further comprising selecting a subblock merge mode with subblock affine prediction for the current block, wherein: the flag is a first flag; selecting the subblock merge mode with subblock affine prediction comprises encoding a second flag into the bitstream using CABAC coding; and a context variable for said second flag is determined based on whether or not a first block neighboring said current block uses subblock affine prediction, and whether or not a second block neighboring said current block uses subblock affine prediction.
8. The method of encoding according to claim 7, wherein the first block is located at the left of the current block and the second block is located above the current block.
9. The method of encoding according to claim 6, wherein in a state where the current block has a size of 16×16, the number of subblocks in the current block is 16 and at least one motion vector per subblock in the current block is to be derived by using the two or three motion vectors in subblock affine prediction.
10. The method of encoding according to claim 6, wherein the temporal motion vector candidate uses a motion vector in a block which is part of an image different from an image including the current block.
11. An encoder for encoding an image into a bitstream using motion prediction, the encoder comprising at least one processor configured to function as: a unit configured to determine, from a plurality of prediction modes including an intra mode and an inter mode, a prediction mode used to encode a current block of the image; a unit configured to include in the bitstream a flag being capable of indicating that the current block is not skipped; a unit configured to include in the bitstream prediction mode information used to determine whether a prediction mode for the current block is the intra mode or the inter mode, when the flag indicates that the current block is not skipped; a unit configured to compile a list of candidate motion predictors in the case where the inter mode is determined for the current block; and a unit configured to place a candidate for subblock affine prediction as a merge candidate lower in said list than a temporal motion vector candidate, the subblock affine prediction deriving at least one motion vector per subblock in the current block by using two or three motion vectors from a block which is of the same frame as the current block, wherein a position of the candidate for subblock affine prediction in said list is determined based on whether or not the block which is of the same frame as the current block uses subblock affine prediction, the block of the same frame corresponding to a block at a position A1 at the bottom left of the current block.
12. A decoder for decoding an image from a bitstream encoded using motion prediction, the decoder comprising at least one processor configured to function as: a unit configured to decode a flag being capable of indicating that a current block is not skipped; a unit configured to decode prediction mode information used to determine whether a prediction mode for the current block is an intra mode or an inter mode, when the flag indicates that the current block is not skipped; a unit configured to determine, from a plurality of prediction modes including the intra mode and the inter mode, the prediction mode used to decode the current block of the image, based on the prediction mode information; a unit configured to compile a list of candidate motion predictors in the case where the inter mode is determined for the current block; and a unit configured to place a candidate for subblock affine prediction as a merge candidate lower in said list than a temporal motion vector candidate, the subblock affine prediction deriving at least one motion vector per subblock in the current block by using two or three motion vectors from a block which is of the same frame as the current block, wherein a position of the candidate for subblock affine prediction in said list is determined based on whether or not the block which is of the same frame as the current block uses subblock affine prediction, the block of the same frame corresponding to a block at a position A1 at the bottom left of the current block.
13. A non-transitory computer-readable storage medium storing a program for causing a computer to execute a method of decoding an image from a bitstream encoded using motion prediction, the method comprising: decoding a flag being capable of indicating that a current block is not skipped; decoding prediction mode information used to determine whether a prediction mode for the current block is an intra mode or an inter mode, when the flag indicates that the current block is not skipped; determining, from a plurality of prediction modes including the intra mode and the inter mode, the prediction mode used to decode the current block of the image, based on the prediction mode information; compiling a list of candidate motion predictors in the case where the inter mode is determined for the current block; and placing a candidate for subblock affine prediction as a merge candidate lower in said list than a temporal motion vector candidate, the subblock affine prediction deriving at least one motion vector per subblock in the current block by using two or three motion vectors from a block which is of the same frame as the current block, wherein a position of the candidate for subblock affine prediction in said list is determined based on whether or not the block which is of the same frame as the current block uses subblock affine prediction, the block of the same frame corresponding to a block at a position A1 at the bottom left of the current block.
14. A non-transitory computer-readable storage medium storing a program for causing a computer to execute a method of encoding an image into a bitstream using motion prediction, the method comprising: determining, from a plurality of prediction modes including an intra mode and an inter mode, a prediction mode used to encode a current block of the image; including in the bitstream a flag being capable of indicating that the current block is not skipped; including in the bitstream prediction mode information used to determine whether a prediction mode for the current block is the intra mode or the inter mode, when the flag indicates that the current block is not skipped; compiling a list of candidate motion predictors; and placing a candidate for subblock affine prediction as a merge candidate lower in said list than a temporal motion vector candidate, the subblock affine prediction deriving at least one motion vector per subblock in the current block by using two or three motion vectors from a block which is of the same frame as the current block, wherein a position of the candidate for subblock affine prediction in said list is determined based on whether or not the block which is of the same frame as the current block uses subblock affine prediction, the block of the same frame corresponding to a block at a position A1 at the bottom left of the current block.
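
As an informal illustration of claims 2 to 4 (and their encoding counterparts, claims 7 to 9), the following sketch shows one way a context index for the second flag could be derived from the left and above neighbours, and one way a motion vector per subblock could be derived from two or three motion vectors. It is a sketch under stated assumptions, not the claimed method: the function names, the rule that the context index is the count of affine neighbours, the placement of the control points at the block corners, and the four- and six-parameter affine equations are assumptions borrowed from common affine motion models. The 4×4 subblock size follows from a 16×16 block containing 16 subblocks.

```python
from typing import List, Sequence, Tuple

MV = Tuple[float, float]  # (horizontal, vertical) motion vector


def affine_flag_context(left_uses_affine: bool, above_uses_affine: bool) -> int:
    """Context index in {0, 1, 2}: how many of the two neighbours use affine.

    Summing the two conditions is an assumed rule for this sketch.
    """
    return int(left_uses_affine) + int(above_uses_affine)


def subblock_motion_vectors(cpmvs: Sequence[MV], width: int, height: int,
                            sub: int = 4) -> List[List[MV]]:
    """Derive one motion vector per sub x sub subblock from 2 or 3 vectors.

    Assumed layout: cpmvs[0] is the top-left, cpmvs[1] the top-right and,
    if present, cpmvs[2] the bottom-left control point of the block.
    """
    if len(cpmvs) not in (2, 3):
        raise ValueError("expected two or three control-point motion vectors")
    (v0x, v0y), (v1x, v1y) = cpmvs[0], cpmvs[1]
    # Change of the motion field per pixel in the horizontal direction.
    dxx = (v1x - v0x) / width
    dyx = (v1y - v0y) / width
    if len(cpmvs) == 3:
        # Six-parameter model: the third vector gives the vertical change.
        v2x, v2y = cpmvs[2]
        dxy = (v2x - v0x) / height
        dyy = (v2y - v0y) / height
    else:
        # Four-parameter model (rotation and zoom): vertical change is implied.
        dxy = -dyx
        dyy = dxx
    field: List[List[MV]] = []
    for y in range(sub // 2, height, sub):  # subblock centres, row by row
        row = []
        for x in range(sub // 2, width, sub):
            row.append((v0x + dxx * x + dxy * y, v0y + dyx * x + dyy * y))
        field.append(row)
    return field
```

For example, `subblock_motion_vectors([(0.0, 0.0), (16.0, 0.0)], 16, 16)` returns a 4×4 grid, i.e. 16 motion vectors for a 16×16 block, and passing identical control-point vectors yields a purely translational field.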

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation application of U.S. patent application Ser. No. 17/752,677, filed on May 24, 2022, which is a divisional application of U.S. patent application Ser. No. 17/275,091, filed on Mar. 10, 2021, which is a National Phase application of PCT Application No. PCT/EP2019/075079, filed on Sep. 18, 2019 and titled “VIDEO CODING AND DECODING”. This application claims the benefit under 35 U.S.C. § 119(a)-(d) of United Kingdom Patent Application No. 1815444.3, filed on Sep. 21, 2018. The above-cited patent applications are incorporated herein by reference in their entirety.

FIELD OF INVENTION

The present invention relates to video coding and decoding.

BACKGROUND

Recently, the Joint Video Experts Team (JVET), a collaborative team formed by MPEG and ITU-T Study Group 16's VCEG, commenced work on a new video coding standard referred to as Versatile Video Coding (VVC). The goal of VVC is to provide significant improvements in compression performance over the existing HEVC standard (i.e., typically twice as much as before) and to be completed in 2020. The main target applications and services include, but are not limited to, 360-degree and high-dynamic-range (HDR) videos. In total, JVET evaluated responses from 32 organizations using formal subjective tests conducted by independent test labs. Some proposals demonstrated compression efficiency gains of typically 40% or more when compared to using HEVC. Particular effectiveness was shown on ultra-high definition (UHD) video test material. Thus, we may expect compression efficiency gains well beyond the targeted 50% for the final standard.

The JVET exploration model (JEM) uses all the HEVC tools. A further tool not present in HEVC is to use an ‘affine motion mode’ when applying motion compensation. Motion compensation in HEVC is limited to translations, but in reality there are many kinds of motion, e.g. zoom in/out, rotation, perspective motions and other irregular motions. When utilising affine motion mode, a more complex transform is applied to a block to attempt to more accurately predict such forms of motion. However, use of an affine motion mode may add to the complexity of the encode/decode process and also may add to the signal overhead.

Accordingly, a solution to at least one of the aforementioned problems is desirable.

In a first aspect of the present invention there is provided a method of signalling a motion prediction mode for a portion of a bitstream, the method comprising: determining an inter prediction mode used for said portion of said bitstream; signalling affine motion mode in dependence on said inter prediction mode used in said portion of said bitstream. Optionally, the inter prediction mode used is determined based on the status of a skip flag in said portion of said bitstream. Optionally, affine mode is not enabled if said skip flag is present. Optionally, the method further comprises enabling a merge mode when said affine mode is enabled. Optionally, affine mode is enabled if said inter prediction mode is Advanced Motion Vector Predictor (AMVP). Optionally, said determining is performed on the basis of a high level syntax flag, wherein said high level syntax flag indicates processing at least one of: slice level, frame level, sequence level, and Coding Tree Unit (CTU) level. Optionally, determining an inter prediction mode comprises determining a mode of one or more blocks neighbouring a current block.

In a second aspect of the present invention there is provided a method of signalling a motion prediction mode in a bitstream, the method comprising: determining a mode of one or more neighbouring blocks to a current block; and in dependence on said mode(s), signalling affine motion mode for the current block. Optionally, said neighbouring blocks consist solely of blocks A1 and B1.
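
The neighbour-dependent signalling of the second aspect can be sketched as follows. This is a minimal illustration rather than the method itself: the function name, the dictionary used to describe the neighbouring blocks, and the treatment of unavailable neighbours as non-affine are assumptions made for the example, and the rule applied here is that affine mode is signalled when at least one inspected neighbour uses it.

```python
from typing import Mapping, Sequence


def affine_signalling_enabled(neighbour_uses_affine: Mapping[str, bool],
                              positions: Sequence[str] = ("A1", "B1")) -> bool:
    """Return True if affine mode would be signalled for the current block.

    Assumed rule: signal when at least one inspected neighbour uses affine.
    A position missing from the mapping (e.g. outside the picture) is
    treated as not using affine mode.
    """
    for pos in positions:  # inspected in series, stopping at the first hit
        if neighbour_uses_affine.get(pos, False):
            return True
    return False
```

The set of inspected positions is a parameter, so the same check applies to other neighbour sets simply by passing a different `positions` sequence.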
Alternatively, said neighbouring blocks comprise blocks A2 and B3; preferably consisting solely of blocks A2 and B3. Optionally, the method comprises enabling affine motion mode if one or both of said neighbouring blocks use affine motion mode. Optionally, said neighbouring blocks further comprise B0, A0 and B2. Optionally, the use of affine mode in said neighbouring blocks is determined in series and affine mode is enabled for the current block if one of said neighbouring blocks uses affine mode. Preferably, the series of the neighbouring blocks is A2, B3, B0, A0, B2.

In a third aspect of the present invention there is provided a method of signalling a motion prediction mode for a portion of a bitstream, the method comprising: determining a list of merge candidates corresponding to blocks neighbouring a current block; and enabling affine mode for said current block if one or more of said merge candidates use affine mode. Optionally, said list starts with the blocks which have been used to determine a context variable relating to said block. Optionally, the list starts with the blocks A2 and B3 in that order. Option