EP-4736432-A1 - LOOP FILTER TRAINING IMPROVEMENT
Abstract
A post-filter based on neural networks (NN) is provided that can replace one or several loop filters or be added to the existing loop filters. A convolutional neural network is added to, or replaces some of, the loop filters. In an embodiment, the neural network-based filter is applied to implement an Adaptive Loop Filter. A training phase generates a correction factor, and scaling is applied to the correction factor. Different filtering can be done on a per-block or per-class basis.
Inventors
- GALPIN, FRANCK
- BORDES, PHILIPPE
- DUMAS, Thierry
- BOISSON, GUILLAUME
Assignees
- InterDigital CE Patent Holdings, SAS
Dates
- Publication Date: 2026-05-06
- Application Date: 2024-06-13
Claims (15)
- 1. A method, comprising: determining a correction value from a neural network; modulating the correction value with a scaling factor; and, filtering a reconstructed portion of video by adding the reconstructed portion of video to the modulated correction value.
- 2. An apparatus, comprising: a memory, and a processor, configured to: determine a correction value from a neural network; modulate the correction value with a scaling factor; and, filter a reconstructed portion of video by adding the reconstructed portion of video to the modulated correction value.
- 3. The method of Claim 1 or the apparatus of Claim 2, wherein the scaling factor is signaled from an encoder to a corresponding decoder.
- 4. The method of any one of Claims 1 or 3, or the apparatus of any one of Claims 2 or 3, wherein an offset is added to the sum of the reconstructed portion of video and the modulated correction value.
- 5. The method of any one of Claims 1, 3, or 4, or the apparatus of any one of Claims 2, 3, or 4, wherein training of the neural network is performed initially with a fixed scaling and subsequently refined.
- 6. The method of any one of Claims 1, 3, 4, or 5, or the apparatus of any one of Claims 2, 3, 4, or 5, wherein said filtering is performed as an adaptive loop filter.
- 7. The method of any one of Claims 1, 3, 4, 5, or 6, or the apparatus of any one of Claims 2, 3, 4, 5, or 6, wherein a value of the scaling factor is associated with a class of filter that is implemented.
- 8. The method of any one of Claims 1, 3, 4, 5, 6, or 7, or the apparatus of any one of Claims 2, 3, 4, 5, 6, or 7, wherein a code is signaled indicative of the scaling factor.
- 9. The method of any one of Claims 1, 3, 4, 5, 6, 7, or 8, or the apparatus of any one of Claims 2, 3, 4, 5, 6, 7, or 8, wherein the scaling factor is signaled for each block of pixels.
- 10. The method of any one of Claims 1, 3, 4, 5, 6, 7, 8, or 9, or the apparatus of any one of Claims 2, 3, 4, 5, 6, 7, 8, or 9, wherein the scaling factor is computed on the correction value and applied before determining a minimization of distortion between filtered samples and an original signal.
- 11. The method of any one of Claims 1, 3, 4, 5, 6, 7, 8, 9, or 10, or the apparatus of any one of Claims 2, 3, 4, 5, 6, 7, 8, 9, or 10, wherein a clipping or rounding operation is performed before applying a scaling factor.
- 12. A device comprising: an apparatus according to Claim 2; and at least one of (i) an antenna configured to receive a signal, the signal including the video block, (ii) a band limiter configured to limit the received signal to a band of frequencies that includes the video block, and (iii) a display configured to display an output representative of a video block.
- 13. A non-transitory computer readable medium containing data content generated according to the method of any one of claims 1, or 3 through 11, or by the apparatus of any one of claims 2, or 3 through 11, for playback using a processor.
- 14. A signal comprising video data generated according to the method of any one of claims 1, or 3 through 11, or by the apparatus of any one of claims 2, or 3 through 11, for playback using a processor.
- 15. A computer program product comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method of any one of claims 1, or 3 through 11.
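The filtering recited in Claims 1, 4, and 11 can be illustrated with a minimal sketch. It is not the applicants' implementation: the function names, the NumPy representation of blocks, the symmetric clipping range, and the final clip to the sample bit depth are all assumptions made for illustration. The closed-form least-squares scale in `least_squares_scale` is likewise only one plausible way an encoder might choose a scaling factor that reduces distortion against the original signal; the claims do not specify this derivation.

```python
import numpy as np


def apply_scaled_correction(rec, corr, scale, offset=0.0,
                            bit_depth=10, corr_clip=None):
    """Return rec + scale * corr + offset, rounded and clipped to bit_depth.

    rec       : reconstructed samples (array-like)
    corr      : correction values, e.g. a neural network output (assumed given)
    scale     : scaling factor modulating the correction (Claim 1)
    offset    : optional offset added to the sum (Claim 4)
    corr_clip : hypothetical bound; if set, the correction is clipped to
                [-corr_clip, corr_clip] before scaling (in the spirit of Claim 11)
    """
    rec = np.asarray(rec, dtype=np.float64)
    corr = np.asarray(corr, dtype=np.float64)
    if corr_clip is not None:
        corr = np.clip(corr, -corr_clip, corr_clip)
    filtered = rec + scale * corr + offset
    max_val = (1 << bit_depth) - 1  # assumed valid sample range [0, 2^bit_depth - 1]
    return np.clip(np.rint(filtered), 0, max_val).astype(np.int64)


def least_squares_scale(orig, rec, corr):
    """Scale s minimizing sum((orig - (rec + s * corr)) ** 2).

    Assumption: a plain least-squares fit, s = <orig - rec, corr> / <corr, corr>.
    """
    orig = np.asarray(orig, dtype=np.float64)
    rec = np.asarray(rec, dtype=np.float64)
    corr = np.asarray(corr, dtype=np.float64)
    energy = float(np.sum(corr * corr))
    if energy == 0.0:
        return 1.0  # correction is all zero, so any scale gives the same result
    return float(np.sum((orig - rec) * corr) / energy)


# Toy 2x2 block: the original equals rec + 0.5 * corr by construction.
rec = np.array([[100, 102], [98, 101]])
corr = np.array([[2.0, -4.0], [6.0, 0.0]])
orig = np.array([[101, 100], [101, 101]])

scale = least_squares_scale(orig, rec, corr)
filtered = apply_scaled_correction(rec, corr, scale)
```

In this toy block the fitted scale is 0.5 and the filtered samples match the original exactly. A per-block or per-class variant (Claims 7 and 9) would call the same functions once per block or class, each with its own scaling factor.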
Description
LOOP FILTER TRAINING IMPROVEMENT
CROSS REFERENCE TO RELATED APPLICATION
This application claims the benefit of European Serial No. 23306062.3 filed June 29, 2023, which is incorporated by reference herein in its entirety.
TECHNICAL FIELD
At least one of the present embodiments generally relates to a method or an apparatus for video encoding or decoding, compression or decompression.
BACKGROUND
To achieve high compression efficiency, image and video coding schemes usually employ prediction, including motion vector prediction, and transform to leverage spatial and temporal redundancy in the video content. Generally, intra or inter prediction is used to exploit the intra or inter frame correlation, then the differences between the original image and the predicted image, often denoted as prediction errors or prediction residuals, are transformed, quantized, and entropy coded. To reconstruct the video, the compressed data are decoded by inverse processes corresponding to the entropy coding, quantization, transform, and prediction.
SUMMARY
At least one of the present embodiments generally relates to a method or an apparatus for video encoding or decoding, and more particularly, to a method or an apparatus for coding or decoding using regressive-based affine bi-prediction weights.
According to a first aspect, there is provided a method. The method comprises steps for determining a correction value from a neural network; modulating the correction value with a scaling factor; and, filtering a reconstructed portion of video by adding the reconstructed portion of video to the modulated correction value.
According to a second aspect, there is provided another method. The method comprises the aforementioned steps, implemented for encoding or decoding.
According to another aspect, there is provided an apparatus. The apparatus comprises a processor. The processor can be configured to operate on digital video data according to the aforementioned methods.
According to another aspect, there is provided an apparatus. The apparatus comprises a processor. The processor can be configured to encode a block of a video or decode video data by executing any of the aforementioned methods.
According to another general aspect of at least one embodiment, there is provided a device comprising an apparatus according to any of the decoding embodiments; and at least one of (i) an antenna configured to receive a signal, the signal including the video block, (ii) a band limiter configured to limit the received signal to a band of frequencies that includes the video block, or (iii) a display configured to display an output representative of a video block.
According to another general aspect of at least one embodiment, there is provided a non-transitory computer readable medium containing data content generated according to any of the described encoding embodiments or variants.
According to another general aspect of at least one embodiment, there is provided a signal comprising video data generated according to any of the described encoding embodiments or variants.
According to another general aspect of at least one embodiment, video data or a bitstream is formatted to include data content generated according to any of the described encoding embodiments or variants.
According to another general aspect of at least one embodiment, there is provided a computer program product comprising instructions which, when the program is executed by a computer, cause the computer to carry out any of the described decoding embodiments or variants.
These and other aspects, features and advantages of the general aspects will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates an example loop filtering process in typical video codecs.
Figure 2 illustrates an example CNN loop filter process in NNVC.
Figure 3 illustrates an example learning process.
Figure 4 illustrates an example of symmetrical filter (left) and filter rotation (right).
Figure 5 illustrates a modified model architecture used during training.
Figure 6 illustrates a model used for training with clipping.
Figure 7 illustrates a model used for training with quantization.
Figure 8 illustrates an example of ALF computation.
Figure 9 illustrates an ALF variant with clipping during the training.
Figure 10 illustrates one embodiment of a first method under the described aspects.
Figure 11 illustrates one embodiment of a second method under the described aspects.
Figure 12 illustrates one embodiment of an apparatus under the described aspects.
Figure 13 illustrates a standard, generic, video compression scheme.
Figure 14 illustrates a standard, generic, video decompression scheme.
Figure 15 illustrates a processor-based system for encoding/decoding under the general described aspects.
DETAILED DESCRIPTION
The embodiments described here are in the field