EP-4736450-A1 - MOTION VECTOR DERIVATION

Abstract

There is provided a method for obtaining, for a current subblock of samples, a subblock of motion compensated predicted samples. The method comprises, based on a boundary distortion criterion, selecting a motion vector, MV, pair for the current subblock of samples from a set of candidate MV pairs. The selecting comprises, for each candidate MV pair, determining a distortion value for the candidate MV pair, wherein determining a distortion value for each candidate MV pair comprises: determining a first distortion value, DV1, for a first candidate MV pair and determining a second distortion value for a second candidate MV pair, wherein DV1 is equal to a sum of a current subblock distortion value, CurrentDistortion, and a subblock boundary distortion value, BoundaryDistortion. The method comprises determining the candidate MV pair having the lowest distortion value, wherein the determined candidate MV pair is the MV pair selected for the current subblock of samples. The method comprises using the selected MV pair to produce the subblock of motion compensated prediction samples.
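The selection step described in the abstract can be sketched as a minimum search over the candidate set. This is only an illustrative sketch: the names `select_mv_pair`, `current_distortion` and `boundary_distortion` are hypothetical, the two distortion measures are passed in as callables because the abstract does not fix how they are computed, and ties are resolved in favour of the earliest candidate as an assumption.

```python
from typing import Callable, Sequence, Tuple

MV = Tuple[int, int]      # (x, y) displacement
MVPair = Tuple[MV, MV]    # two MVs forming one candidate pair


def select_mv_pair(
    candidates: Sequence[MVPair],
    current_distortion: Callable[[MVPair], int],
    boundary_distortion: Callable[[MVPair], int],
) -> Tuple[MVPair, int]:
    """Return the candidate MV pair with the lowest total distortion.

    Assumption: every candidate is scored as
    CurrentDistortion + BoundaryDistortion; the source only requires this
    sum for the first candidate's value, DV1.
    """
    if not candidates:
        raise ValueError("candidate set must not be empty")
    best_pair, best_value = None, None
    for pair in candidates:
        value = current_distortion(pair) + boundary_distortion(pair)
        # Strict comparison keeps the earliest candidate on ties (assumption).
        if best_value is None or value < best_value:
            best_pair, best_value = pair, value
    return best_pair, best_value
```

In this sketch the selected pair would then be handed to the motion compensation stage to produce the prediction samples; that stage is outside the scope of the example.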

Inventors

ANDERSSON, Kenneth
YU, Ruoyang

Assignees

Telefonaktiebolaget LM Ericsson (publ)

Dates

Publication Date: 2026-05-06
Application Date: 2024-06-04

Claims (1)

1. A method for obtaining, for a current subblock of samples, a subblock of motion compensated predicted samples, the method comprising: based on a boundary distortion criterion, selecting a motion vector, MV, pair for the current subblock of samples from a set of candidate MV pairs, wherein the selecting comprises: for each candidate MV pair, determining a distortion value for the candidate MV pair, wherein determining a distortion value for each candidate MV pair comprises: determining a first distortion value, DV1, for a first candidate MV pair and determining a second distortion value for a second candidate MV pair, wherein DV1 is equal to a sum of a current subblock distortion value, CurrentDistortion, and a subblock boundary distortion value, BoundaryDistortion; determining the candidate MV pair having the lowest distortion value, wherein the determined candidate MV pair is the MV pair selected for the current subblock of samples; and using the selected MV pair to produce the subblock of motion compensated prediction samples.

2. The method of claim 1, wherein BoundaryDistortion is a function of: a scaling factor, SF, a vertical boundary distortion value, vBD, and a horizontal boundary distortion value, hBD.

3. The method of claim 2, wherein BoundaryDistortion = SF × ((vBD + hBD + 2)>>2).

4. The method of any of claims 2-3, wherein
vBD = (sumVBSDiff)>>4+4*(vBGDiff+vBLDiff),
hBD = (sumHBSDiff)>>4+4*(hBGDiff+hBLDiff),
sumHBSDiff = Σ_{i=0}^{M-1} abs(A_{i,0} - B_{i,N-1}),
sumVBSDiff = Σ_{j=0}^{N-1} abs(A_{0,j} - D_{M-1,j}),
vBGDiff = abs(D_{M-1,0} - D_{M-1,N-1} - (A_{0,0} - A_{0,N-1})),
vBLDiff = abs(D_{M-1,0} - 2*D_{M-1,N/2} + D_{M-1,N-1} - (A_{0,0} - 2*A_{0,N/2} + A_{0,N-1})),
hBGDiff = abs(B_{0,N-1} - B_{M-1,N-1} - (A_{0,0} - A_{M-1,0})), and
hBLDiff = abs(B_{0,N-1} - 2*B_{M/2,0} + B_{M-1,0} - (A_{0,0} - 2*A_{M/2,0} + A_{M-1,0})),
wherein A is the current subblock, A is an MxN subblock of values A_{i,j} for i=0 to M-1 and j=0 to N-1, B is a neighbouring subblock above A, B is an MxN subblock of values B_{i,j} for i=0 to M-1 and j=0 to N-1, D is a neighbouring subblock left of A, and D is an MxN subblock of values D_{i,j} for i=0 to M-1 and j=0 to N-1.

5. The method of claim 4, wherein A is derived based on CPB1 and/or CPB2, where CPB1 is a first candidate prediction subblock derived from the first MV of the first candidate MV pair and CPB2 is a second candidate prediction subblock derived from the second MV of the first candidate MV pair.

6. The method of claim 5, wherein A is equal to the average of CPB1 and CPB2, A is equal to a weighted average of CPB1 and CPB2, A is equal to CPB1, or A is equal to CPB2.

7. The method of any one of claims 1-6, wherein determining the second distortion value for the second candidate MV pair comprises determining whether the second distortion value should be determined using a second subblock boundary distortion value.

8. The method of claim 7, wherein determining whether the second distortion value should be determined using a second subblock boundary distortion value comprises: determining whether a spatial activity for a first reference subblock identified by a first MV of the second candidate MV pair satisfies a first criterion; determining whether an absolute difference between i) a first component of the first MV of the second candidate MV pair and ii) a corresponding component of a MV of a neighboring subblock satisfies a second criterion; determining whether a boundary between the current subblock and a neighboring subblock is a true edge; and/or determining whether a subblock boundary distortion value for the second candidate MV pair satisfies a third criterion.

9. The method of claim 8, further comprising: adding the subblock boundary distortion value to a current subblock distortion value to produce a total distortion value as a result of: determining that the spatial activity for the first reference subblock satisfies the first criterion; determining that the absolute difference between i) the first component of the first MV of the second candidate MV pair and ii) the corresponding component satisfies the second criterion; determining that the boundary between the current subblock and a neighboring subblock is not a true edge; and/or determining that the subblock boundary distortion value satisfies the third criterion.

10. A computer program (1143) comprising instructions (1144) which when executed by processing circuitry (1102) cause the processing circuitry (1102) to perform the method of any one of the above claims.

11. A carrier containing the computer program of claim 10, wherein the carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium (1142).

12. An apparatus (1100) for encoding a picture, the apparatus configured to perform the method of any one of claims 1-9.

13. An apparatus (1100), the apparatus comprising: a memory (1142); and processing circuitry (1102) coupled to the memory (1142), wherein the apparatus (1100) is configured to perform the method of any one of claims 1-9.
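The boundary distortion arithmetic of claims 2 to 4 can be illustrated with a small sketch. Several points are assumptions rather than statements of the claimed method: the helper names are hypothetical; the shift in vBD and hBD is read as `(sum >> 4) + 4 * (...)`; the summed term is taken to be a sum of absolute sample differences across the boundary; and the same gradient and second-difference pattern is applied along whichever boundary is passed in, using only the row or column of samples adjacent to that boundary.

```python
from typing import Sequence, Tuple


def edge_terms(cur_edge: Sequence[int], nbr_edge: Sequence[int]) -> Tuple[int, int, int]:
    """Return (sum_diff, gradient_diff, linearity_diff) along one boundary.

    cur_edge holds the current subblock's samples next to the boundary,
    nbr_edge the neighbouring subblock's samples on the other side.
    """
    if len(cur_edge) != len(nbr_edge) or len(cur_edge) < 2:
        raise ValueError("edges must have equal length of at least 2")
    last, mid = len(cur_edge) - 1, len(cur_edge) // 2
    # Assumed: sum of absolute differences across the boundary.
    sum_diff = sum(abs(a - n) for a, n in zip(cur_edge, nbr_edge))
    # Difference between end-to-end gradients on each side.
    grad_diff = abs((nbr_edge[0] - nbr_edge[last]) - (cur_edge[0] - cur_edge[last]))
    # Difference between second differences (first, middle, last sample).
    lin_diff = abs(
        (nbr_edge[0] - 2 * nbr_edge[mid] + nbr_edge[last])
        - (cur_edge[0] - 2 * cur_edge[mid] + cur_edge[last])
    )
    return sum_diff, grad_diff, lin_diff


def boundary_term(sum_diff: int, grad_diff: int, lin_diff: int) -> int:
    """vBD or hBD, with the shift grouped as (sum >> 4) (assumption)."""
    return (sum_diff >> 4) + 4 * (grad_diff + lin_diff)


def boundary_distortion(sf: int, vbd: int, hbd: int) -> int:
    """BoundaryDistortion = SF * ((vBD + hBD + 2) >> 2), as in claim 3."""
    return sf * ((vbd + hbd + 2) >> 2)
```

For example, with a current left column of 100, 120, 140, 160 and a neighbouring right column of 110, 120, 130, 200, the three edge terms in this sketch are 60, 30 and 70, giving a vBD of 403. With hBD equal to 0 and SF equal to 2, the resulting BoundaryDistortion is 202.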

Description

MOTION VECTOR DERIVATION

TECHNICAL FIELD

[0001] This disclosure relates to methods and apparatus for motion vector derivation.

BACKGROUND

[0002] VVC and ECM

[0003] Versatile Video Coding (VVC) is a block-based video codec standardized by ITU-T and MPEG. Enhanced Coding Model (ECM) is an exploratory codec which is currently under development. The aim of ECM is to demonstrate, and provide evidence of, video coding capabilities beyond VVC. The current ECM version is ECM-9.0.

[0004] Video and Picture

[0005] A video (a.k.a., “video sequence”) comprises a series of pictures. In VVC, each picture is identified with a picture order count (POC) value. The POC value also represents the display order of the picture. A picture with a smaller POC value is displayed before another picture with a larger POC value.

[0006] Components

[0007] Each picture commonly consists of three components: one luma component Y, whose sample values are luma values, and two chroma components Cb and Cr, whose sample values are chroma values. Each component can be described as a two-dimensional rectangular array of sample values. It is also common for the chroma components to be smaller than the luma component by a factor of two in each dimension. For example, the size of the luma component of an HD picture would be 1920x1080 and the chroma components would each have the dimension of 960x540. Components are sometimes referred to as color components.

[0008] Blocks, Subblocks, and Units

[0009] A block is a two-dimensional (2D) matrix of sample values (or “samples” for short). A block may be divided into two or more subblocks, where each subblock is a matrix of samples. In video coding, each component of a picture is split into blocks and the coded video bitstream consists of a series of coded blocks. It is common in video coding that pictures are split into units that cover a specific area of the picture.
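As a minimal sketch of the statement that a block may be divided into two or more subblocks, the following helper splits a 2D block of samples into a grid of subblocks. The function name is hypothetical, and the use of equally sized, non-overlapping subblocks is an assumption made for illustration; the text above only requires that each subblock is a matrix of samples.

```python
from typing import List

Block = List[List[int]]  # rows of sample values


def split_into_subblocks(block: Block, sub_h: int, sub_w: int) -> List[List[Block]]:
    """Split a block into a grid of sub_h x sub_w subblocks.

    The result is indexed as grid[row_of_subblocks][column_of_subblocks].
    """
    height, width = len(block), len(block[0])
    if height % sub_h or width % sub_w:
        raise ValueError("block dimensions must be multiples of the subblock size")
    return [
        [
            # Take sub_w samples from each of sub_h consecutive rows.
            [row[x:x + sub_w] for row in block[y:y + sub_h]]
            for x in range(0, width, sub_w)
        ]
        for y in range(0, height, sub_h)
    ]
```

Splitting a 4x4 block into 2x2 subblocks with this helper yields a 2x2 grid of subblocks, each holding four samples.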
[0010] Each unit consists of all blocks from all components that make up that specific area of the picture, and each block belongs fully to one unit. The Coding Unit (CU) in VVC is an example of a unit. In VVC, the CUs may be split recursively into smaller CUs. The CU at the top level is referred to as the coding tree unit (CTU). A CU usually contains three coding blocks, i.e., one coding block for luma and two coding blocks for chroma. In VVC, CU sizes range from 4x4 up to 128x128. In the current ECM, CU sizes range from 4x4 up to 256x256.

[0011] Parameter sets, slice headers, and picture headers

[0012] VVC specifies three types of parameter sets: the picture parameter set (PPS), the sequence parameter set (SPS), and the video parameter set (VPS). The PPS contains data that is common for all units of a picture, the SPS contains data that is common for a coded layer video sequence (CLVS), and the VPS contains data that is common for multiple CLVSs, e.g., data for multiple layers in the bitstream.

[0013] A picture may be divided into independently coded slices, where decoding of one slice in a picture is independent of other slices of the same picture. Each slice has a slice header comprising syntax elements. Decoded slice header values from these syntax elements are used when decoding the slice. In VVC, a coded picture contains a picture header. The picture header contains parameters that are common for all slices of the coded picture.

[0014] Intra prediction

[0015] In intra prediction, also known as spatial prediction, a current block is predicted using previously decoded blocks within the same picture. The samples from the previously decoded blocks within the same picture are used to predict the samples inside the current block. A picture consisting of only intra-predicted blocks is referred to as an intra picture.
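To illustrate how previously decoded samples in the same picture can predict the current block, the sketch below fills a block with the rounded mean of the neighbouring samples above and to the left. This DC-style averaging is chosen only as a simple example of spatial prediction; the text does not specify a prediction mode, and the function name and its parameters are hypothetical.

```python
from typing import List, Sequence


def dc_intra_predict(
    above: Sequence[int], left: Sequence[int], width: int, height: int
) -> List[List[int]]:
    """Predict a width x height block from already decoded neighbours.

    above: decoded samples in the row directly above the current block.
    left:  decoded samples in the column directly left of the current block.
    """
    neighbours = list(above) + list(left)
    if not neighbours:
        raise ValueError("at least one neighbouring sample is required")
    # Rounded integer mean of all available neighbouring samples.
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * width for _ in range(height)]
```

An encoder would typically code only the difference between the actual samples and such a prediction, which is why a good prediction reduces the amount of data to transmit.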
[0016] Inter prediction

[0017] In inter prediction, also known as temporal prediction, a current block of the current picture is predicted using blocks from previously decoded pictures (these blocks are referred to as reference blocks). The samples from the reference blocks in the previously decoded pictures are used to predict the samples inside the current block. A picture that comprises one or more inter-predicted blocks is referred to as an inter picture. The previously decoded pictures used for inter prediction are referred to as reference pictures.

[0018] The location of a reference block inside a reference picture is indicated using a vector (i.e., a set of values), which is referred to as a “motion vector (MV)”. Each MV consists of two values: an x value (a.k.a., x component) and a y value (a.k.a., y component), which represent the displacement between the current block and the reference block in the x and y dimensions, respectively. The value of a component may have a resolution finer than an integer position. When that is the case, filtering (typically interpolation) is applied to calculate the values used for prediction. FIG. 1 shows an example of a MV for the current block C. The example M