US-12627810-B2 - Motion vector prediction with derived motion trajectory
Abstract
Decoding using motion vector prediction with derived motion trajectory includes obtaining, from previously reconstructed reference frames available for reconstructing a current frame, reference frame motion fields data for reconstructing the current frame, obtaining, using the reference frame motion fields data, trajectory mapping data for reconstructing the current frame, accessing, from the encoded bitstream, current encoded block data for a current block of the current frame, obtaining a motion vector prediction for the current block in accordance with the trajectory mapping data, obtaining a differential motion vector from the current encoded block data, obtaining a motion vector for the current block by adding the motion vector prediction and the differential motion vector, decoding the current block using the motion vector to obtain decoded block data for the current block, and obtaining reconstructed frame data for the current frame using the decoded block data.
Inventors
- Bohan LI
- Jingning Han
- Debargha Mukherjee
- Yaowu Xu
Assignees
- GOOGLE LLC
Dates
- Publication Date: 2026-05-12
- Application Date: 2024-11-27
Claims (20)
- 1 . A method comprising: generating reconstructed video data by decoding an encoded bitstream, wherein decoding the encoded bitstream includes: obtaining, from previously reconstructed reference frames available for reconstructing a current frame, reference frame motion fields data for reconstructing the current frame; obtaining, using the reference frame motion fields data, trajectory mapping data for reconstructing the current frame; accessing, from the encoded bitstream, current encoded block data for a current block of the current frame; obtaining a motion vector prediction for the current block in accordance with the trajectory mapping data; obtaining a differential motion vector from the current encoded block data; obtaining a motion vector for the current block by adding the motion vector prediction and the differential motion vector; decoding the current block using the motion vector to obtain decoded block data for the current block; obtaining reconstructed frame data for the current frame using the decoded block data; and including the reconstructed frame data in the reconstructed video data; and outputting the reconstructed video data.
- 2 . The method of claim 1 , wherein obtaining the reference frame motion fields data includes: including, in the reference frame motion fields data, less than or equal to a first defined maximum cardinality of candidate reference motion fields; and for the previously reconstructed reference frames: determining whether a current cardinality of candidate reference motion fields in the reference frame motion fields data is less than a second defined maximum cardinality; and in response to determining that the current cardinality of candidate reference motion fields in the reference frame motion fields data is less than the second defined maximum cardinality: obtaining, in coding recency order, a next most recently coded reference frame; and determining whether the reference frame motion fields data includes a first portion of a motion field of the next most recently coded reference frame that is oriented toward the current frame.
- 3 . The method of claim 2 , wherein obtaining the reference frame motion fields data includes: in response to determining that the reference frame motion fields data omits the first portion of the motion field of the next most recently coded reference frame that is oriented toward the current frame, including the first portion of the motion field of the next most recently coded reference frame in the reference frame motion fields data as a candidate reference motion field.
- 4 . The method of claim 2 , wherein obtaining the reference frame motion fields data includes: in response to determining that the reference frame motion fields data includes the first portion of the motion field of the next most recently coded reference frame that is oriented toward the current frame: obtaining a portion of the motion field of the next most recently coded reference frame that is oriented away from the current frame; and including the portion of the motion field of the next most recently coded reference frame that is oriented away from the current frame in the reference frame motion fields data as a candidate reference motion field.
- 5 . The method of claim 1 , wherein obtaining the trajectory mapping data includes: obtaining a current reference motion vector from the reference frame motion fields data; determining whether to connect the current reference motion vector to previously identified trajectory data for reconstructing the current frame; and in response to determining to connect the current reference motion vector to the previously identified trajectory data, connecting the current reference motion vector to the previously identified trajectory data in trajectory mapping data for reconstructing the current frame.
- 6 . The method of claim 5 , wherein determining whether to connect the current reference motion vector to previously identified trajectory data for reconstructing the current frame includes: in response to determining that an endpoint of the current reference motion vector intersects with a previously identified trajectory from the previously identified trajectory data, connecting the current reference motion vector to the previously identified trajectory.
- 7 . The method of claim 5 , wherein obtaining the trajectory mapping data includes: determining whether to generate trajectory data for reconstructing the current frame in accordance with the current reference motion vector; in response to determining to generate the trajectory data: generating the trajectory data for reconstructing the current frame in accordance with the current reference motion vector; and including the trajectory data in trajectory mapping data for reconstructing the current frame.
- 8 . An apparatus comprising: a non-transitory computer-readable medium; and a processor configured to execute instructions stored on the non-transitory computer-readable medium to: generate reconstructed video data, wherein, to generate the reconstructed video data, the processor executes the instructions to decode an encoded bitstream, wherein, to decode the encoded bitstream, the processor executes the instructions to: obtain, from previously reconstructed reference frames available for reconstructing a current frame, reference frame motion fields data for reconstructing the current frame; obtain, using the reference frame motion fields data, trajectory mapping data for reconstructing the current frame; access, from the encoded bitstream, current encoded block data for a current block of the current frame; obtain a motion vector prediction for the current block in accordance with the trajectory mapping data; obtain a differential motion vector from the current encoded block data; obtain, as a motion vector for the current block, a sum of the motion vector prediction and the differential motion vector; decode the current block in accordance with the motion vector to obtain decoded block data for the current block; obtain reconstructed frame data for the current frame in accordance with the decoded block data; and include the reconstructed frame data in the reconstructed video data; and output the reconstructed video data.
- 9 . The apparatus of claim 8 , wherein, to obtain the reference frame motion fields data, the processor executes the instructions to: include, in the reference frame motion fields data, less than or equal to a first defined maximum cardinality of candidate reference motion fields; and for the previously reconstructed reference frames: determine whether a current cardinality of candidate reference motion fields in the reference frame motion fields data is less than a second defined maximum cardinality; and in response to a determination that the current cardinality of candidate reference motion fields in the reference frame motion fields data is less than the second defined maximum cardinality: obtain, in coding recency order, a next most recently coded reference frame; and determine whether the reference frame motion fields data includes a first portion of a motion field of the next most recently coded reference frame that is oriented toward the current frame.
- 10 . The apparatus of claim 9 , wherein, to obtain the reference frame motion fields data, the processor executes the instructions to: in response to a determination that the reference frame motion fields data omits the first portion of the motion field of the next most recently coded reference frame that is oriented toward the current frame, include the first portion of the motion field of the next most recently coded reference frame in the reference frame motion fields data as a candidate reference motion field.
- 11 . The apparatus of claim 9 , wherein, to obtain the reference frame motion fields data, the processor executes the instructions to: in response to a determination that the reference frame motion fields data includes the first portion of the motion field of the next most recently coded reference frame that is oriented toward the current frame: obtain a portion of the motion field of the next most recently coded reference frame that is oriented away from the current frame; and include the portion of the motion field of the next most recently coded reference frame that is oriented away from the current frame in the reference frame motion fields data as a candidate reference motion field.
- 12 . The apparatus of claim 8 , wherein, to obtain the trajectory mapping data, the processor executes the instructions to: obtain a current reference motion vector from the reference frame motion fields data; determine whether to connect the current reference motion vector to previously identified trajectory data for reconstructing the current frame; and in response to a determination to connect the current reference motion vector to the previously identified trajectory data, connect the current reference motion vector to the previously identified trajectory data in trajectory mapping data for reconstructing the current frame.
- 13 . The apparatus of claim 12 , wherein, to determine whether to connect the current reference motion vector to previously identified trajectory data for reconstructing the current frame, the processor executes the instructions to: in response to a determination that an endpoint of the current reference motion vector intersects with a previously identified trajectory from the previously identified trajectory data, connect the current reference motion vector to the previously identified trajectory.
- 14 . The apparatus of claim 12 , wherein, to obtain the trajectory mapping data, the processor executes the instructions to: determine whether to generate trajectory data for reconstructing the current frame in accordance with the current reference motion vector; and in response to a determination to generate the trajectory data: generate the trajectory data for reconstructing the current frame in accordance with the current reference motion vector; and include the trajectory data in trajectory mapping data for reconstructing the current frame.
- 15 . A method comprising: generating an encoded bitstream by encoding a current frame from an input video stream, wherein encoding the current frame includes: obtaining, from previously reconstructed reference frames available for encoding the current frame, reference frame motion fields data for encoding the current frame; obtaining, using the reference frame motion fields data, trajectory mapping data for encoding the current frame; obtaining a motion vector prediction for a current block from the current frame in accordance with the trajectory mapping data; obtaining current encoded block data by encoding a current block of the current frame using a current motion vector; obtaining, as a differential motion vector, a result of subtracting the motion vector prediction from the current motion vector; and including, in the encoded bitstream, current encoded block data for the current block, wherein the current encoded block data for the current block includes the differential motion vector; and outputting the encoded bitstream.
- 16 . The method of claim 15 , wherein obtaining the reference frame motion fields data includes: including, in the reference frame motion fields data, less than or equal to a first defined maximum cardinality of candidate reference motion fields; and for the previously reconstructed reference frames: determining whether a current cardinality of candidate reference motion fields in the reference frame motion fields data is less than a second defined maximum cardinality; and in response to determining that the current cardinality of candidate reference motion fields in the reference frame motion fields data is less than the second defined maximum cardinality: obtaining, in coding recency order, a next most recently coded reference frame; and determining whether the reference frame motion fields data includes a first portion of a motion field of the next most recently coded reference frame that is oriented toward the current frame.
- 17 . The method of claim 16 , wherein obtaining the reference frame motion fields data includes: in response to determining that the reference frame motion fields data omits the first portion of the motion field of the next most recently coded reference frame that is oriented toward the current frame, including the first portion of the motion field of the next most recently coded reference frame in the reference frame motion fields data as a candidate reference motion field.
- 18 . The method of claim 16 , wherein obtaining the reference frame motion fields data includes: in response to determining that the reference frame motion fields data includes the first portion of the motion field of the next most recently coded reference frame that is oriented toward the current frame: obtaining a portion of the motion field of the next most recently coded reference frame that is oriented away from the current frame; and including the portion of the motion field of the next most recently coded reference frame that is oriented away from the current frame in the reference frame motion fields data as a candidate reference motion field.
- 19 . The method of claim 15 , wherein obtaining the trajectory mapping data includes: obtaining a current reference motion vector from the reference frame motion fields data; determining whether to connect the current reference motion vector to previously identified trajectory data for reconstructing the current frame; in response to determining to connect the current reference motion vector to the previously identified trajectory data, connecting the current reference motion vector to the previously identified trajectory data in trajectory mapping data for reconstructing the current frame; determining whether to generate trajectory data for reconstructing the current frame in accordance with the current reference motion vector; and in response to determining to generate the trajectory data: generating the trajectory data for reconstructing the current frame in accordance with the current reference motion vector; and including the trajectory data in trajectory mapping data for reconstructing the current frame.
- 20 . The method of claim 19 , wherein determining whether to connect the current reference motion vector to previously identified trajectory data for reconstructing the current frame includes: in response to determining that an endpoint of the current reference motion vector intersects with a previously identified trajectory from the previously identified trajectory data, connecting the current reference motion vector to the previously identified trajectory.
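The candidate gathering described in claims 2-4 (and the parallel apparatus and encoder claims) can be illustrated with a short sketch. This is a simplified, hypothetical model, not the claimed implementation: it collapses the first and second defined maximum cardinalities into a single `max_candidates` limit, and it models each reference frame as exposing a motion-field portion oriented toward the current frame and one oriented away from it. Because the same coded frame may occupy multiple reference slots, its toward-oriented portion can already be present when the frame is encountered again, in which case the away-oriented portion is added instead.

```python
def collect_reference_motion_fields(reference_frames, max_candidates):
    """Gather candidate reference motion fields for the current frame.

    reference_frames: previously coded reference frames in coding
    recency order (most recently coded first); the same underlying
    frame may appear in more than one reference slot.
    max_candidates: defined maximum cardinality of candidates.
    """
    candidates = []
    for frame in reference_frames:
        # Stop once the defined maximum cardinality is reached.
        if len(candidates) >= max_candidates:
            break
        toward = ("toward", frame["id"])
        if toward not in candidates:
            # Prefer the portion of the motion field oriented toward
            # the current frame.
            candidates.append(toward)
        else:
            # The toward-oriented portion is already included, so fall
            # back to the portion oriented away from the current frame.
            candidates.append(("away", frame["id"]))
    return candidates
```

With frames A, A (in two reference slots), and B, the sketch yields A's toward-portion, A's away-portion, and then B's toward-portion, in coding recency order.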
Description
CROSS-REFERENCE TO RELATED APPLICATION(S)
This application claims priority to and the benefit of U.S. Provisional Patent Application Ser. No. 63/604,389, filed Nov. 30, 2023, the entire disclosure of which is hereby incorporated by reference.
BACKGROUND
Digital images and video can be used, for example, on the internet, for remote business meetings via video conferencing, high-definition video entertainment, video advertisements, or sharing of user-generated content. Due to the large amount of data involved in transferring and processing image and video data, high-performance compression may be advantageous for transmission and storage. Accordingly, it would be advantageous to provide high-resolution image and video transmission over communication channels having limited bandwidth.
SUMMARY
This application relates to encoding and decoding of image data, video stream data, or both, for transmission, storage, or both. Disclosed herein are aspects of systems, methods, and apparatuses for encoding and decoding using motion vector prediction with derived motion trajectory. Variations in these and other aspects will be described in additional detail hereafter. An aspect is a method for decoding using motion vector prediction with derived motion trajectory. Decoding using motion vector prediction with derived motion trajectory includes generating reconstructed video data by decoding an encoded bitstream and outputting the reconstructed video data.
Decoding the encoded bitstream includes obtaining, from previously reconstructed reference frames available for reconstructing a current frame, reference frame motion fields data for reconstructing the current frame, obtaining, using the reference frame motion fields data, trajectory mapping data for reconstructing the current frame, accessing, from the encoded bitstream, current encoded block data for a current block of the current frame, obtaining a motion vector prediction for the current block in accordance with the trajectory mapping data, obtaining a differential motion vector from the current encoded block data, obtaining a motion vector for the current block by adding the motion vector prediction and the differential motion vector, decoding the current block using the motion vector to obtain decoded block data for the current block, obtaining reconstructed frame data for the current frame using the decoded block data, and including the reconstructed frame data in the reconstructed video data. An aspect is a method for encoding using motion vector prediction with derived motion trajectory. Encoding using motion vector prediction with derived motion trajectory includes generating an encoded bitstream by encoding a current frame from an input video stream and outputting the encoded bitstream. 
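The decoder-side arithmetic described above reduces to adding the trajectory-derived prediction back to the signaled residual. A minimal sketch, using hypothetical two-component integer motion vectors (the function name and tuple representation are illustrative, not from the disclosure):

```python
def reconstruct_motion_vector(mv_prediction, differential_mv):
    """Recover a block's motion vector at the decoder.

    The bitstream carries only the differential motion vector; the
    prediction is derived from the trajectory mapping data, so the
    decoder adds the two components back together.
    """
    return (mv_prediction[0] + differential_mv[0],
            mv_prediction[1] + differential_mv[1])
```

For example, a prediction of (3, -2) combined with a signaled differential of (1, 5) yields the motion vector (4, 3).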
Encoding the current frame includes obtaining, from previously reconstructed reference frames available for encoding the current frame, reference frame motion fields data for encoding the current frame, obtaining, using the reference frame motion fields data, trajectory mapping data for encoding the current frame, obtaining a motion vector prediction for a current block from the current frame in accordance with the trajectory mapping data, obtaining current encoded block data by encoding a current block of the current frame using a current motion vector, obtaining, as a differential motion vector, a result of subtracting the motion vector prediction from the current motion vector, and including, in the encoded bitstream, current encoded block data for the current block, wherein the current encoded block data for the current block includes the differential motion vector. An aspect is an apparatus for encoding using motion vector prediction with derived motion trajectory. The apparatus includes a non-transitory computer readable medium, and a processor configured to execute instructions stored on the non-transitory computer readable medium to generate the encoded bitstream, wherein, to generate the encoded bitstream, the processor executes the instructions to encode a current frame from an input video stream and output the encoded bitstream. 
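The encoder-side step is the mirror image: only the residual between the searched motion vector and the trajectory-derived prediction is written to the bitstream. A minimal sketch under the same hypothetical two-component representation (names are illustrative, not from the disclosure):

```python
def differential_motion_vector(current_mv, mv_prediction):
    """Encoder side: compute the differential motion vector to signal.

    Subtracting the trajectory-derived prediction from the searched
    motion vector leaves a (typically small) residual for the
    bitstream.
    """
    return (current_mv[0] - mv_prediction[0],
            current_mv[1] - mv_prediction[1])
```

A better prediction gives a smaller differential, which costs fewer bits to entropy code; the decoder recovers the original vector by adding the prediction back.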
To encode the current frame the processor executes the instructions to obtain, from previously reconstructed reference frames available for encoding the current frame, reference frame motion fields data for encoding the current frame, obtain, using the reference frame motion fields data, trajectory mapping data for encoding the current frame, obtain a motion vector prediction for a current block from the current frame in accordance with the trajectory mapping data, obtain current encoded block data, wherein to obtain current encoded block data, the processor executes the instructions to encode a current block of the current frame using a current motion vector, obtain, as a differential motion vector, a result of subtracting the motion vector prediction from the current motion vector, and include, in the encoded bitstream, current encoded block data for the current block, wherein the current encoded block data for the current block includes the differential motion vector.
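The trajectory derivation itself (claims 5-7 and 19-20) can also be sketched: a reference motion vector is connected to previously identified trajectory data when its endpoint intersects an existing trajectory; otherwise new trajectory data is generated from it. This is a simplified, hypothetical model (the `Trajectory` class, the coordinate tuples, and the intersection `tolerance` are illustrative assumptions, not from the disclosure):

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    # Ordered (frame_index, x, y) points tracing one motion path
    # across the reference frames.
    points: list


def build_trajectories(motion_vectors, tolerance=0.5):
    """Link reference motion vectors into trajectory mapping data.

    motion_vectors: list of (frame_idx, start_xy, end_xy) tuples, each
    mapping a position in one reference frame to a position in another.
    """
    trajectories = []
    for frame_idx, start, end in motion_vectors:
        attached = False
        for traj in trajectories:
            _, tx, ty = traj.points[-1]
            # Connect when the vector's start point intersects (within
            # tolerance) the current head of an existing trajectory.
            if abs(start[0] - tx) <= tolerance and abs(start[1] - ty) <= tolerance:
                traj.points.append((frame_idx, end[0], end[1]))
                attached = True
                break
        if not attached:
            # No intersection: generate new trajectory data from this
            # reference motion vector.
            trajectories.append(Trajectory(points=[
                (frame_idx, start[0], start[1]),
                (frame_idx, end[0], end[1]),
            ]))
    return trajectories
```

A motion vector prediction for a block of the current frame could then be read off the trajectory that passes nearest the block's position, extrapolated to the current frame's display time.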