EP-4736411-A1 - NEW HYPOTHESIS FOR MULTI-HYPOTHESIS INTER PREDICTION MODE
Abstract
A method for decoding a current block of a current picture from video data, comprising: obtaining (801) first motion information of a first predictor for the current block; deriving (802) second motion information for at least one second predictor from the first motion information of the first predictor; and generating (803) a final predictor for the current block using at least the first predictor and each second predictor.
Inventors
- REUZE, Kevin
- GALPIN, Franck
- ROBERT, Antoine
- NASER, Karam
Assignees
- InterDigital CE Patent Holdings, SAS
Dates
- Publication Date
- 2026-05-06
- Application Date
- 2024-06-10
Claims (33)
1. A method for encoding a current block of a current picture in video data, comprising: obtaining (801) first motion information of a first predictor for the current block; deriving (802) second motion information for at least one second predictor from the first motion information of the first predictor; and generating (803) a final predictor for the current block using at least the first predictor and each second predictor.

2. A method for decoding a current block of a current picture from video data, comprising: obtaining (801) first motion information of a first predictor for the current block; deriving (802) second motion information for at least one second predictor from the first motion information of the first predictor; and generating (803) a final predictor for the current block using at least the first predictor and each second predictor.

3. The method of claim 1 or 2, wherein the first predictor is a base predictor obtained first for the current block or an additional predictor obtained for the current block after the base predictor.

4. The method of claim 3, wherein the second motion information comprises a second motion vector whose coordinates depend on the coordinates of a first motion vector comprised in the first motion information, weighted by a weighting factor.

5. The method of claim 4, wherein the second motion information comprises a second reference picture index on a second reference picture, the second reference picture index depending on a first reference picture index on a first reference picture comprised in the first motion information and on the weighting factor.

6. The method of any previous claim, wherein the second motion information results from an application of a refinement process to intermediate motion information derived from the first motion information, the refinement process providing a motion information refinement of the intermediate motion information, the second motion information being a sum of the intermediate motion information and the motion information refinement.

7. The method of claim 6, wherein the motion information refinement is obtained by minimizing a sum of absolute differences between the first predictor and the second predictor.

8. The method of claim 6, wherein the motion information refinement is obtained by minimizing a sum of absolute differences between the current block and the second predictor, the motion information refinement being signaled in the video data.

9. The method of claim 6, wherein the motion information refinement is obtained by a template matching process.

10. The method of any of claims 4 to 9, wherein the first motion information is bi-prediction motion information comprising two first pairs, each first pair comprising a motion vector and a reference picture index, the second motion information comprising two second pairs, each second pair being derived from one of the first pairs, a second predictor being obtained from each second pair.

11. The method of claim 10, wherein the final predictor is based on at least one of the second predictors.

12. The method of claim 11, wherein the final predictor is based on one of the second predictors selected based on a criterion.

13. The method of claim 12, wherein the criterion depends on a mode used for obtaining the first predictor.

14. The method of any previous claim, wherein a syntax element signals that the second predictor is obtained based on second motion information derived from the first motion information.

15. The method of any previous claim, wherein a use of the second predictor obtained based on second motion information derived from the first motion information is allowed by a high-level syntax.

16. A device for encoding a current block of a current picture in video data, comprising electronic circuitry configured for: obtaining (801) first motion information of a first predictor for the current block; deriving (802) second motion information for at least one second predictor from the first motion information of the first predictor; and generating (803) a final predictor for the current block using at least the first predictor and each second predictor.

17. A device for decoding a current block of a current picture from video data, comprising electronic circuitry configured for: obtaining (801) first motion information of a first predictor for the current block; deriving (802) second motion information for at least one second predictor from the first motion information of the first predictor; and generating (803) a final predictor for the current block using at least the first predictor and each second predictor.

18. The device of claim 16 or 17, wherein the first predictor is a base predictor obtained first for the current block or an additional predictor obtained for the current block after the base predictor.

19. The device of claim 18, wherein the second motion information comprises a second motion vector whose coordinates depend on the coordinates of a first motion vector comprised in the first motion information, weighted by a weighting factor.

20. The device of claim 19, wherein the second motion information comprises a second reference picture index on a second reference picture, the second reference picture index depending on a first reference picture index on a first reference picture comprised in the first motion information and on the weighting factor.

21. The device of any of claims 16 to 20, wherein the second motion information results from an application of a refinement process to intermediate motion information derived from the first motion information, the refinement process providing a motion information refinement of the intermediate motion information, the second motion information being a sum of the intermediate motion information and the motion information refinement.

22. The device of claim 21, wherein the motion information refinement is obtained by minimizing a sum of absolute differences between the first predictor and the second predictor.

23. The device of claim 21, wherein the motion information refinement is obtained by minimizing a sum of absolute differences between the current block and the second predictor, the motion information refinement being signaled in the video data.

24. The device of claim 21, wherein the motion information refinement is obtained by a template matching process.

25. The device of any of claims 19 to 24, wherein the first motion information is bi-prediction motion information comprising two first pairs, each first pair comprising a motion vector and a reference picture index, the second motion information comprising two second pairs, each second pair being derived from one of the first pairs, a second predictor being obtained from each second pair.

26. The device of claim 25, wherein the final predictor is based on at least one of the second predictors.

27. The device of claim 26, wherein the final predictor is based on one of the second predictors selected based on a criterion.

28. The device of claim 27, wherein the criterion depends on a mode used for obtaining the first predictor.

29. The device of any of claims 16 to 28, wherein a syntax element signals that the second predictor is obtained based on second motion information derived from the first motion information.

30. The device of any of claims 16 to 29, wherein a use of the second predictor obtained based on second motion information derived from the first motion information is allowed by a high-level syntax.

31. A non-transitory information storage medium storing program code instructions for implementing the method according to any of claims 1 to 15.

32. A computer program comprising program code instructions for implementing the method according to any of claims 1 to 15.

33. A signal generated by the method of claim 1, or of any of claims 3 to 15 when depending on claim 1, or by the device of claim 16, or of any of claims 18 to 30 when depending on claim 16.
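The derivation claimed above (claims 4 to 9: a second motion vector scaled by a weighting factor, a second reference picture index depending on that factor, and a refinement minimizing a sum of absolute differences) can be sketched in Python. This is a hedged, non-normative illustration: the weight values, the index rule, the search range, and the `fetch_block` helper are assumptions for demonstration, not taken from the application.

```python
# Illustrative sketch of the claimed second-motion-information derivation.
# Weight values, the reference-index rule, and the search range are
# assumptions; the application does not fix them here.

def derive_second_mv(first_mv, w):
    """Claim 4: second MV coordinates depend on the first MV scaled by w."""
    return (round(first_mv[0] * w), round(first_mv[1] * w))

def derive_second_ref_idx(first_ref_idx, w):
    """Claim 5: second reference index depends on the first index and on w.
    One plausible rule (assumed): step further along the list as w grows."""
    return first_ref_idx + max(0, round(w) - 1)

def sad(a, b):
    """Sum of absolute differences between two equally sized sample blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def refine_mv(intermediate_mv, fetch_block, first_pred, search=1):
    """Claims 6-7: refine the intermediate MV by minimizing the SAD between
    the first predictor and each candidate second predictor; the refined MV
    is the intermediate MV plus the best-scoring offset."""
    best = None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            cand = (intermediate_mv[0] + dx, intermediate_mv[1] + dy)
            cost = sad(first_pred, fetch_block(cand))
            if best is None or cost < best[0]:
                best = (cost, cand)
    return best[1]
```

In this sketch `fetch_block` stands in for motion-compensated prediction from the reference picture; a real codec would also clip the refined vector to the allowed range.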
Description
2023PF00496

NEW HYPOTHESIS FOR MULTI-HYPOTHESIS INTER PREDICTION MODE

1. CROSS REFERENCE TO RELATED APPLICATIONS

This application claims priority to European Application No. 23306028.4, filed June 27, 2023, which is incorporated herein by reference in its entirety.

2. TECHNICAL FIELD

At least one of the present embodiments generally relates to a method and a device for improving a multi-hypothesis inter prediction mode in video compression methods.

3. BACKGROUND

To achieve high compression efficiency, video coding schemes usually employ predictions and transforms to leverage spatial and temporal redundancies in a video content. During encoding, pictures of the video content are divided into blocks of samples (i.e., pixels), these blocks then being partitioned into one or more sub-blocks, called original sub-blocks in the following. An intra or inter prediction is then applied to each sub-block to exploit intra- or inter-picture correlations. Whatever the prediction method used (intra or inter), a predictor sub-block is determined for each original sub-block. Then, a sub-block representing the difference between the original sub-block and the predictor sub-block, often denoted a prediction error sub-block, a prediction residual sub-block or simply a residual sub-block, is transformed, quantized and entropy coded to generate an encoded video stream. To reconstruct the video, the compressed data is decoded by the inverse processes corresponding to the transform, quantization and entropy coding.

Inter prediction consists of predicting a current block of a current picture from at least one predictor block of a reference picture preceding or following the current picture. A predictor block is identified in the reference picture by motion information.
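The predict/residual/reconstruct principle described in the background can be sketched minimally as follows. This is an illustrative assumption-laden sketch (integer blocks, a plain scalar quantizer `q`, and no transform stage), not the codec's actual processing chain.

```python
# Minimal sketch of the prediction-residual principle: the encoder codes the
# difference between the original and the predictor; the decoder adds the
# (dequantized) residual back to the predictor. The quantizer `q` and the
# omission of the transform stage are simplifying assumptions.

def residual(original, predictor):
    """Residual sub-block: sample-wise difference original - predictor."""
    return [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(original, predictor)]

def quantize(block, q):
    return [[round(v / q) for v in row] for row in block]

def dequantize(block, q):
    return [[v * q for v in row] for row in block]

def reconstruct(predictor, deq_residual):
    """Decoder side: predictor plus the dequantized residual."""
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(predictor, deq_residual)]
```

Note that with `q > 1` the reconstruction is lossy: small residual samples can be rounded away, which is exactly the rate/distortion trade-off the quantizer controls.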
Recently, a new inter prediction mode called multi-hypothesis inter prediction (MHP) mode has been proposed (see document JVET-M0425, "CE10: Multi-hypothesis inter prediction (Test 10.1.2)", Martin Winken, Heiko Schwarz, Detlev Marpe, Thomas Wiegand, Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 13th Meeting: Marrakech, MA, 9–18 Jan. 2019). In the MHP mode, in addition to a traditional mono-prediction or bi-prediction predictor, one or more additional inter predictors (also called additional prediction hypotheses) are signaled. A resulting overall predictor is obtained for the current block by a sample-wise weighted superposition. Motion parameters of each additional prediction hypothesis can be signaled either explicitly, by specifying a reference index, a motion vector predictor index and a motion vector difference, or implicitly, by specifying a merge index. This explicit or implicit signaling is considered sub-optimal because neither form of signaling considers the mode or the motion vectors already in use by the current block.

It is desirable to propose solutions overcoming the above issues. In particular, it is desirable to propose solutions improving the MHP mode by considering the base prediction of the current block.

4. BRIEF SUMMARY

In a first aspect, one or more of the present embodiments provide a method for encoding a current block of a current picture in video data comprising: obtaining first motion information of a first predictor for the current block; deriving second motion information for at least one second predictor from the first motion information of the first predictor; and generating a final predictor for the current block using at least the first predictor and each second predictor.
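The sample-wise weighted superposition used by the MHP mode can be sketched as a recursive blend of each additional hypothesis into the running predictor, p ← (1 − a)·p + a·h, which is consistent with the accumulation described in JVET-M0425; the concrete weights and block contents below are illustrative assumptions.

```python
# Hedged sketch of MHP sample-wise weighted superposition. Each additional
# hypothesis h with weight a is blended into the running predictor p:
#   p_new = (1 - a) * p + a * h, applied per sample.
# Weights and sample values are illustrative, not normative.

def superpose(base_predictor, hypotheses):
    """Blend additional prediction hypotheses (block, weight) into the base."""
    p = [row[:] for row in base_predictor]  # copy so the base is untouched
    for h, a in hypotheses:
        p = [[(1 - a) * pv + a * hv for pv, hv in zip(rp, rh)]
             for rp, rh in zip(p, h)]
    return p
```

With a base predictor of constant value 100 and one hypothesis of constant value 200 at weight 0.25, every output sample is blended to 125, showing the sample-wise nature of the superposition.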
In a second aspect, one or more of the present embodiments provide a method for decoding a current block of a current picture from video data comprising: obtaining first motion information of a first predictor for the current block; deriving second motion information for at least one second predictor from the first motion information of the first predictor; and generating a final predictor for the current block using at least the first predictor and each second predictor.

In an embodiment of the first or second aspect, the first predictor is a base predictor obtained first for the current block or an additional predictor obtained for the current block after the base predictor.

In an embodiment of the first or second aspect, the second motion information comprises a second motion vector whose coordinates depend on the coordinates of a first motion vector comprised in the first motion information, weighted by a weighting factor.

In an embodiment of the first or second aspect, the second motion information comprises a second reference picture index on a second reference picture, the second reference picture index depending on a first reference picture index on a first reference picture comprised in the first motion information and on the weighting factor.

In an embodiment of the first or second aspect, the second motion information results from an application of a refinement process to an intermediate motion