EP-4740463-A1 - REFERENCE FILTERING FOR INTER PREDICTION
Abstract
A video coder (encoder or decoder) determines, based on illumination compensation being enabled for a reference block, a multiple-tap filter model. The video coder further determines a plurality of coefficients of the multiple-tap filter model, based on a first plurality of template samples of a template of the block in a picture and a second plurality of template samples of a reference template of the reference block in a reference picture different from the picture. The video coder applies a multiple-tap filter, corresponding to the multiple-tap filter model with the determined plurality of coefficients, to the reference block to generate a predicted block, and codes the block based on the predicted block.
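As a rough illustration of this derivation (not the claimed implementation), the minimum-squared-error coefficient fit described in the claims can be sketched in one dimension: a hypothetical 3-tap-plus-bias row filter is fit by least squares so that filtered reference-template samples match the current template, and the fitted filter is then applied to reference samples. Real templates are two-dimensional rows and columns around the block, and the function names and tap layout below are assumptions for the example.

```python
import numpy as np

def fit_multi_tap_filter(ref_template, cur_template):
    """Least-squares fit of a hypothetical 3-tap (left, center, right) + bias
    model mapping reference-template samples to current-template samples."""
    x = np.asarray(ref_template, dtype=np.float64)
    left, right = np.roll(x, 1), np.roll(x, -1)
    # Design matrix rows: [left neighbour, sample, right neighbour, 1].
    # Drop the first/last rows, where np.roll wrapped around the edges.
    A = np.stack([left, x, right, np.ones_like(x)], axis=1)[1:-1]
    b = np.asarray(cur_template, dtype=np.float64)[1:-1]
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs  # [c_left, c_center, c_right, bias]

def apply_multi_tap_filter(ref_samples, coeffs):
    """Apply the fitted filter to a row of reference samples to produce
    predicted samples (edge positions wrap; interior positions are valid)."""
    x = np.asarray(ref_samples, dtype=np.float64)
    return (coeffs[0] * np.roll(x, 1) + coeffs[1] * x
            + coeffs[2] * np.roll(x, -1) + coeffs[3])
```

For a pure scale-and-offset illumination change (cur = 2·ref + 10), the fit recovers a center tap of 2, side taps of 0, and a bias of 10, so the filtered reference reproduces the current samples exactly.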
Inventors
- FILIPPOV, Alexey Konstantinovich
- RUFITSKIY, Vasily Alexeevich
- DINAN, Esmael Hejazi
Assignees
- Ofinno, LLC
Dates
- Publication Date: 2026-05-13
- Application Date: 2024-06-27
Claims (41)
Docket No.: 23-2028PCT

CLAIMS

What is claimed is:

1. A method comprising: determining, based on illumination compensation being enabled for a reference block, a multiple-tap filter model to be applied to the reference block to generate a predicted block for reconstructing a block, wherein the multiple-tap filter model is determined from a plurality of filter models that comprises the multiple-tap filter model and a linear filter model with a single spatial component and a bias term; determining a plurality of coefficients of the multiple-tap filter model, based on: a first plurality of template samples of a template of the block in a picture; and a second plurality of template samples of a reference template of the reference block in a reference picture different from the picture; applying a multiple-tap filter, corresponding to the multiple-tap filter model with the determined plurality of coefficients, to the reference block to generate the predicted block; and coding the block based on the predicted block.

2. A method comprising: determining, based on illumination compensation being enabled for a reference block, a multiple-tap filter model; determining a plurality of coefficients of the multiple-tap filter model, based on: a first plurality of template samples of a template of the block in a picture; and a second plurality of template samples of a reference template of the reference block in a reference picture different from the picture; applying a multiple-tap filter, corresponding to the multiple-tap filter model with the determined plurality of coefficients, to the reference block to generate a predicted block; and coding the block based on the predicted block.

3. The method according to claim 2, wherein the multiple-tap filter model is determined from a plurality of filter models.

4. The method according to claim 3, wherein the plurality of filter models comprises the multiple-tap filter model and a linear filter model with a single spatial component and a bias term.

5. The method according to any one of claims 1 or 2, wherein the multiple-tap filter model comprises two or more spatial components and a bias component.

6. The method according to any one of claims 1-5, wherein the multiple-tap filter model comprises a linear filter model.

7. The method according to any one of claims 1-5, wherein the multiple-tap filter model comprises a linear filter model comprising one or more components with a non-linear function.

8. The method according to any one of claims 1-5, wherein the multiple-tap filter model comprises a derivative filter model of an n-th order, wherein n is a positive integer.

9. The method according to any one of claims 1-5, wherein the multiple-tap filter model comprises a combination of linear filter models.

10. The method according to any one of claims 1-9, wherein the determining the plurality of coefficients is based on a minimum squared error technique applied between the first plurality of template samples and a plurality of filtered samples output from the multiple-tap filter model applied to the second plurality of template samples.

11. The method according to any one of claims 1-10, wherein the template comprises a plurality of columns of samples nearest a left edge of the block and a plurality of rows of samples nearest a top edge of the block, and the reference template comprises a plurality of columns of samples nearest a left edge of the reference block and a plurality of rows of samples nearest a top edge of the reference block.

12. The method according to any one of claims 1-11, wherein the second plurality of template samples comprises samples adjacent to the reference template.

13. The method according to claim 12, wherein the samples adjacent to the reference template comprise padded samples.

14. The method according to claim 12, wherein the samples adjacent to the reference template comprise reconstructed samples.

15. The method according to any one of claims 1-14, wherein the multiple-tap filter model comprises a plurality of spatial components comprising: a spatial component for a target sample on which the multiple-tap filter is applied, and a spatial component for each selected sample adjacent to the target sample.

16. The method according to any one of claims 1-15, wherein the determining the plurality of coefficients comprises: calculating, based on the first plurality of template samples and the second plurality of template samples, a respective coefficient corresponding to each filter position of the multiple-tap filter.

17. The method according to any one of claims 1-16, wherein the applying the multiple-tap filter, corresponding to the multiple-tap filter model with the determined plurality of coefficients, to the reference block to generate the predicted block comprises: calculating each predicted sample of the predicted block based at least on the determined plurality of coefficients, a reference sample from the reference block, and a bias term.

18. The method according to any one of claims 1-16, wherein the applying the multiple-tap filter, corresponding to the multiple-tap filter model with the determined plurality of coefficients, to the reference block to generate the predicted block comprises: calculating each predicted sample of the predicted block based at least on the determined plurality of coefficients, a reference sample from the reference block, one or more non-linear functions of the reference sample, and a bias term.

19. The method according to claim 18, wherein the one or more non-linear functions of the reference sample comprises at least a first-order derivative of the reference sample or a second-order derivative of the reference sample.

20. The method according to any one of claims 1-19, wherein the multiple-tap filter model comprises 5 spatial components.

21. The method according to any one of claims 1-20, wherein the multiple-tap filter model comprises 9 spatial components.

22. The method according to any one of claims 1-21, wherein the multiple-tap filter model comprises 17 spatial components.

23. The method according to any one of claims 1-20, wherein the multiple-tap filter model comprises a plurality of spatial components arranged in a cross shape with a target sample, on which the multiple-tap filter model is applied, being in a center of the cross shape.

24. The method according to any one of claims 1-20, wherein the multiple-tap filter model comprises a plurality of spatial components corresponding to samples arranged in an x-cross shape with a target sample, on which the multiple-tap filter model is applied, being in a center of the x-cross shape.

25. The method according to any one of claims 1-22, wherein the multiple-tap filter model comprises a plurality of spatial components arranged in a rectangular shape with a target sample, on which the multiple-tap filter model is applied, being in a center of the rectangular shape.

26. The method according to any one of claims 1-25, wherein the determining a plurality of coefficients of the multiple-tap filter model comprises: calculating at least one of a plurality of first-order derivatives or a plurality of higher-order derivatives from the second plurality of template samples; and determining the plurality of coefficients based on the first plurality of template samples, the second plurality of template samples, and the at least one of the plurality of first-order derivatives or the plurality of higher-order derivatives.

27. The method according to claim 26, wherein the calculating at least one of a plurality of first-order derivatives or a plurality of second-order derivatives comprises: calculating, for each of respective samples in the template or the reference template, a respective plurality of the first-order derivatives or the higher-order derivatives corresponding to the respective sample.

28. The method according to any one of claims 1 or 3-27, wherein the multiple-tap filter model is determined from a plurality of filter models by: for each filter model of the plurality of filter models, determining parameters for the filter model based on a first area of the reference template and the template, applying a filter, corresponding to the filter model with the determined parameters, to a second area of the reference template and the template to determine an error associated with the filter model; and determining, from the plurality of filter models, a filter model having a smaller error than another filter model of the plurality of filter models as the multiple-tap filter model.

29. The method according to any one of claims 1-28, wherein the multiple-tap filter model is determined in accordance with a selected merge candidate based on merge mode inter prediction being used for the block.

30. The method according to claim 29, wherein the determining a plurality of coefficients of the multiple-tap filter model comprises determining, based on relative block sizes of a block of the selected merge candidate and the block, the plurality of coefficients of the multiple-tap filter model.

31. The method of claim 30, wherein the determining a plurality of coefficients of the multiple-tap filter model comprises inferring the plurality of coefficients of the multiple-tap filter model from a plurality of coefficients of a multiple-tap filter model of the block of the selected merge candidate based on the block having a size smaller than the block of the selected merge candidate.

32. The method according to any one of claims 1-31, wherein the coding the block based on the predicted block comprises reconstructing the block based on a prediction error decoded from a bitstream and the predicted block.

33. The method according to any one of claims 1-32, further comprising: decoding, from a bitstream, a first indication indicating that illumination compensation is enabled for the reference block.

34. The method according to claim 33, further comprising: decoding, from the bitstream, an identification of the multiple-tap filter model.

35. The method according to claim 34, wherein the identification of the multiple-tap filter model is decoded from a codeword, wherein a relative length of the codeword is determined in accordance with a relative number of parameters in a parametric model associated with the multiple-tap filter.

36. The method according to any one of claims 1-31, further comprising: encoding a first indication indicating that illumination compensation is enabled for the reference block.

37. The method according to any one of claims 1-31 or 36, further comprising encoding, in a bitstream, an identification of the multiple-tap filter model.

38. The method of any one of claims 1-31, wherein the coding the block based on the predicted block comprises: encoding, in a bitstream, a prediction error based on the predicted block and the block.

39. An encoder comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the encoder to perform the method of any one of claims 1-31 and 36-38.

40. A decoder comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the decoder to perform the method of any one of claims 1-35.

41. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors of an apparatus, cause the apparatus to perform the method of any one of claims 1-38.
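The model-selection procedure of claim 28 (fit each candidate filter model on one area of the template, measure its error on another area, keep the model with the smaller error) can be sketched as follows. This is an illustrative reading only: the half/half template split, the function names, and the two candidate models (a scale-plus-offset LIC-style model and an identity passthrough) are assumptions, not the claimed implementation.

```python
import numpy as np

def fit_linear(ref, cur):
    """Fit a single-spatial-component + bias model: cur ~ a*ref + b."""
    A = np.stack([ref, np.ones_like(ref)], axis=1)
    return np.linalg.lstsq(A, cur, rcond=None)[0]

def apply_linear(ref, params):
    return params[0] * ref + params[1]

def fit_identity(ref, cur):
    return None  # no parameters: the prediction is the reference itself

def apply_identity(ref, params):
    return np.asarray(ref, dtype=np.float64)

def select_filter_model(ref_template, cur_template, models):
    """Fit each candidate on the first area (here: first half) of the
    template, score its squared error on the second area, and keep the
    model with the smaller error."""
    r = np.asarray(ref_template, dtype=np.float64)
    c = np.asarray(cur_template, dtype=np.float64)
    half = len(r) // 2
    best_name, best_err = None, np.inf
    for name, fit, apply_fn in models:
        params = fit(r[:half], c[:half])
        err = float(np.sum((apply_fn(r[half:], params) - c[half:]) ** 2))
        if err < best_err:
            best_name, best_err = name, err
    return best_name
```

With a template pair related by cur = 3·ref + 5, the linear model validates with near-zero error and is selected; when the templates are identical, the identity model suffices.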
Description
TITLE

Reference Filtering for Inter Prediction

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 63/525,670, filed July 8, 2023, which is hereby incorporated by reference in its entirety.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] Some features are shown by way of example, and not by limitation, in the accompanying drawings. In the drawings, like numerals reference similar elements.

[0003] FIG. 1 shows an example video coding/decoding system in which embodiments of the present disclosure may be implemented.

[0004] FIG. 2 shows an example encoder in which embodiments of the present disclosure may be implemented.

[0005] FIG. 3 shows an example decoder in which embodiments of the present disclosure may be implemented.

[0006] FIG. 4 shows an example quadtree partitioning of a coding tree block (CTB).

[0007] FIG. 5 shows an example quadtree corresponding to the example quadtree partitioning of the CTB in FIG. 4.

[0008] FIG. 6 shows examples of binary tree and ternary tree partitions.

[0009] FIG. 7 shows an example of combined quadtree and multi-type tree partitioning of a CTB.

[0010] FIG. 8 shows an example tree corresponding to the combined quadtree and multi-type tree partitioning of the CTB shown in FIG. 7.

[0011] FIG. 9 shows an example set of reference samples determined for intra prediction of a current block.

[0012] FIGS. 10A and 10B show example intra prediction modes.

[0013] FIG. 11 shows an example of a current block and corresponding reference samples.

[0014] FIG. 12 shows an example of applying an intra prediction mode (e.g., an angular mode) for prediction of a current block.

[0015] FIG. 13A shows an example of inter prediction performed for a current block in a current picture.

[0016] FIG. 13B shows an example motion vector.

[0017] FIG. 14 shows an example of bi-prediction performed for a current block.

[0018] FIG. 15A shows example spatial candidate neighboring blocks relative to a current block being coded.

[0019] FIG. 15B shows example locations of two temporal, co-located blocks relative to a current block.

[0020] FIG. 16 shows an example of intra block copy (IBC).

[0021] FIG. 17A shows an example of a current block and a reference block with corresponding templates used in determining scale and offset parameters for local illumination compensation (LIC) during inter prediction, according to some embodiments.

[0022] FIG. 17B shows a process for generating a predicted block when LIC is used during inter prediction, according to some embodiments.

[0023] FIG. 18A shows a process for generating a predicted block by applying an illumination compensation function (e.g., a filter model) that uses a multiple-tap filter to generate predicted samples from samples of the reference template, according to some embodiments.

[0024] FIG. 18B shows an example reference template and example multiple-tap filter models that can be applied to the reference template to generate a multi-tap filter, according to some embodiments.

[0025] FIG. 19A shows an example process of using the gradients of multiple-tap filter samples to obtain illumination-compensated predicted samples for inter prediction, according to some embodiments.

[0026] FIG. 19B shows examples of first-order derivatives and second-order derivatives of example samples for filtering, according to some embodiments.

[0027] FIG. 20 shows a flowchart of a process of signaling illumination compensation, according to some embodiments.

[0028] FIG. 21 shows a flowchart of a process by which the decoder can determine, when using inter prediction merge mode, whether illumination compensation based on multi-parametric reference filtering (MPRF) is used for the block, according to some embodiments.

[0029] FIG. 22 shows an example coding scheme to efficiently encode and decode an indication of filter models (e.g., an MPRF model), according to some embodiments.

[0030] FIG. 23A shows a flowchart of a method of deriving a filter model to be applied for illumination compensation in inter prediction, according to some embodiments.

[0031] FIG. 23B shows an example reference template format and a current template format that are used to derive the filter model of a plurality of filter models in the method shown in the flowchart of FIG. 23A, according to some embodiments.

[0032] FIG. 24 shows an example of when filter model parameters can be obtained from a merge candidate, according to some embodiments.

[0033] FIGS. 25A and 25B show example block sizes of merge candidate blocks that can be considered when deciding whether to obtain filter model parameters from one of the merge candidates, according to some embodiments.

[0034] FIG. 26 shows a flowchart of a method for applying illumination compensation based on MPRF in inter prediction, according to some embodiments.

[0035] FIG. 27 shows a flowchart of a method for determining a current block that has been encoded by applying illumination compensation
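The first-order and second-order sample derivatives referenced above (FIG. 19B and claims 26-27) can be illustrated with central differences over a row of samples. The use of central differences here is an assumption for the sketch, not the disclosed derivative definition.

```python
import numpy as np

def sample_derivatives(samples):
    """Hypothetical first- and second-order sample derivatives computed
    by central differences over a 1-D row of samples."""
    x = np.asarray(samples, dtype=np.float64)
    d1 = (np.roll(x, -1) - np.roll(x, 1)) / 2.0    # first-order
    d2 = np.roll(x, -1) - 2.0 * x + np.roll(x, 1)  # second-order
    # Drop the first/last entries, where np.roll wrapped around the edges.
    return d1[1:-1], d2[1:-1]
```

On a linear ramp the first-order derivative is constant and the second-order derivative vanishes; on a quadratic sequence the second-order derivative is the constant 2.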