
US-12627836-B2 - Deep intra predictor generating side information


Abstract

At least a method and an apparatus are presented for efficiently encoding or decoding video. For example, an intra prediction of an image block and side information are determined using at least one neural network from a context comprising pixels surrounding the image block. The side information allows a decoder to determine the intra prediction and is signaled for the decoding.

Inventors

  • Thierry Dumas
  • Franck Galpin
  • Jean Begaint
  • Fabien Racape

Assignees

  • INTERDIGITAL MADISON PATENT HOLDINGS, SAS

Dates

Publication Date
20260512
Application Date
20201009
Priority Date
20191011

Claims (18)

  1. A method comprising: decoding from a bitstream, for a luma block being decoded in a picture of a video, a side information for a neural network intra prediction, the side information and the neural network intra prediction having been generated at encoding by applying a neural network to the luma block and to a context surrounding the luma block, the side information comprising neural network feature values encoded into the bitstream; determining, for the luma block being decoded, a luma component of the neural network intra prediction by applying a neural network to the side information and to the context surrounding the luma block, wherein the side information guides the neural network intra prediction at the decoding; and decoding the luma block using the determined neural network intra prediction.
  2. The method of claim 1, further comprising: decoding, for a chroma block being decoded in the picture of the video, a side information for the neural network intra prediction, the side information and the neural network intra prediction having been generated at encoding by applying the neural network to a context surrounding the chroma block and the context surrounding the luma block; and determining, for the chroma block, a chroma component of the neural network intra prediction by applying a neural network to the context surrounding the chroma block, the context surrounding the luma block, and the side information.
  3. The method of claim 1, further comprising: decoding, for a chroma block being decoded in the picture of the video, a side information for the neural network intra prediction, the side information and the neural network intra prediction having been generated at encoding by applying the neural network to a context surrounding a luma block that is collocated with the chroma block; and determining, for the chroma block, a chroma component of the neural network intra prediction by applying a neural network to the context surrounding the luma block that is collocated with the chroma block and the side information.
  4. The method of claim 1 further comprising: decoding a syntax element indicating that a neural network-based intra prediction is used for intra prediction of the luma block.
  5. An apparatus comprising one or more processors, wherein the one or more processors are configured to: decode from a bitstream, for a luma block being decoded in a picture of a video, a side information for a neural network intra prediction, the side information and the neural network intra prediction having been generated at encoding by applying a neural network to the luma block and to a context surrounding the luma block, the side information comprising neural network feature values encoded into the bitstream; determine, for the luma block being decoded, a luma component of the neural network intra prediction by applying a neural network to the side information and to the context surrounding the luma block, wherein the side information guides the neural network intra prediction at the decoding; and decode the luma block using the determined neural network intra prediction.
  6. The apparatus of claim 5, wherein the one or more processors are configured to: decode, for a chroma block being decoded in the picture of the video, a side information for the neural network intra prediction, the side information and the neural network intra prediction having been generated at encoding by applying the neural network to a context surrounding the chroma block and a context surrounding a luma block that is collocated with the chroma block; and determine, for the chroma block, a chroma component of the neural network intra prediction by applying a neural network to the context surrounding the chroma block, the context surrounding the luma block, and the side information.
  7. The apparatus of claim 5, wherein the one or more processors are configured to: decode, for a chroma block being decoded in the picture of the video, a side information for the neural network intra prediction, the side information and the neural network intra prediction having been generated at encoding by applying a neural network to a context surrounding a luma block that is collocated with the chroma block; determine, for the chroma block, a chroma component of the neural network intra prediction and side information by applying a neural network to the context surrounding the luma block and the side information.
  8. The apparatus of claim 5, wherein the one or more processors are configured to: decode a syntax element indicating that a neural network-based intra prediction is used for intra prediction of the luma block.
  9. A method for video encoding, comprising: determining, from a luma block being encoded in a picture of a video, by applying a neural network to the luma block and to a context surrounding the luma block, a luma component of a neural network intra prediction and a side information to guide neural network intra prediction at decoding, the side information comprising neural network feature values; encoding the luma block based on the neural network intra prediction; and encoding the side information into a bitstream.
  10. The method of claim 9, further comprising: determining, for a chroma block collocated with the luma block being encoded in the picture of the video, a chroma component of the neural network intra prediction and side information by applying the neural network to the chroma block and to the context surrounding the luma block; and encoding the chroma block based on the determined neural network intra prediction.
  11. The method of claim 9, further comprising: determining, for a chroma block being encoded in the picture of the video, a chroma component of the neural network intra prediction and side information by applying the neural network to a context surrounding the chroma block and the context surrounding the luma block; and encoding the chroma block based on the determined neural network intra prediction.
  12. The method of claim 9, further comprising: determining, for a chroma block being encoded in the picture of the video, a chroma component of the neural network intra prediction and side information by applying the neural network to a context surrounding a luma block that is collocated with the chroma block; and encoding the chroma block based on the determined neural network intra prediction.
  13. The method of claim 9, further comprising: encoding a syntax element indicating that a neural network-based intra prediction is used for intra prediction of the luma block.
  14. An apparatus for video encoding, comprising one or more processors, wherein the one or more processors are configured to: determine, from a luma block being encoded in a picture of a video, by applying a neural network to the luma block and to a context surrounding the luma block, a luma component of a neural network intra prediction and a side information to guide neural network intra prediction at decoding, the side information comprising neural network feature values; encode the luma block based on the neural network intra prediction; and encode the side information into a bitstream.
  15. The apparatus of claim 14, wherein the one or more processors are configured to: determine, for a chroma block collocated with the luma block being encoded in the picture of the video, a chroma component of the neural network intra prediction and side information by applying the neural network to the chroma block and to the context surrounding the luma block; and encode the chroma block based on the determined neural network intra prediction.
  16. The apparatus of claim 14, wherein the one or more processors are configured to: determine, for a chroma block being encoded in the picture of the video, a chroma component of the neural network intra prediction and side information by applying the neural network to a context surrounding the chroma block and the context surrounding the luma block; and encode the chroma block based on the determined neural network intra prediction.
  17. The apparatus of claim 14, wherein the one or more processors are configured to: determine, for a chroma block being encoded in the picture of the video, a chroma component of the neural network intra prediction and side information by applying the neural network to a context surrounding a luma block that is collocated with the chroma block; and encode the chroma block based on the determined neural network intra prediction.
  18. The apparatus of claim 14, wherein the one or more processors are configured to: encode a syntax element indicating that a neural network-based intra prediction is used for intra prediction of the luma block.
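
As an informal illustration of the data flow in claims 1, 4, 9 and 13 (not the claimed implementation), the following minimal Python sketch replaces both neural networks with trivial arithmetic stand-ins. Everything named here is an assumption made for illustration: the functions `encoder_net` and `decoder_net`, the flag name `nn_intra_flag`, the quantization step `STEP`, the single-value side information, and the use of a dictionary in place of a real entropy-coded bitstream. The residual is left unquantized so that the round trip is exact.

```python
from statistics import mean

STEP = 4  # hypothetical quantization step for the side-information feature


def encoder_net(block, context):
    """Stand-in for the encoder-side network: it sees the block AND its context."""
    offset = mean(block) - mean(context)
    return [round(offset / STEP)]  # side information: one quantized feature value


def decoder_net(side_info, context, size):
    """Stand-in for the decoder-side network: it sees side information and context only."""
    base = mean(context) + side_info[0] * STEP
    return [base] * size


def encode(block, context):
    side_info = encoder_net(block, context)
    # The encoder forms the same prediction the decoder will be able to form.
    prediction = decoder_net(side_info, context, len(block))
    residual = [b - p for b, p in zip(block, prediction)]
    # A dict stands in for the bitstream; the flag is a hypothetical syntax element.
    return {"nn_intra_flag": 1, "side_info": side_info, "residual": residual}


def decode(bitstream, context):
    assert bitstream["nn_intra_flag"] == 1  # block was coded with the NN-based mode
    size = len(bitstream["residual"])
    prediction = decoder_net(bitstream["side_info"], context, size)
    return [p + r for p, r in zip(prediction, bitstream["residual"])]


context = [50, 52, 48, 50]  # already-decoded neighbours, poorly correlated with the block
block = [90, 92, 94, 96]    # samples of the block being coded

stream = encode(block, context)
decoded = decode(stream, context)
```

In this toy setting the context alone would suggest values near 50, while the transmitted feature shifts the prediction close to the block, which is the role the claims assign to the side information.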

Description

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a U.S. National Stage Application under 35 U.S.C. 371 of International Patent Application No. PCT/EP2020/078460, filed Oct. 9, 2020, which is incorporated herein by reference in its entirety. This application claims the benefit of European Patent Application No. 19306330.2, filed Oct. 11, 2019, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

At least one of the present embodiments generally relates to a method or an apparatus for video encoding or decoding, and more particularly, to a method or an apparatus that obtains a neural network intra prediction and side information using a neural network from at least one input data; and encodes or decodes the side information and an image block using the neural network intra prediction.

BACKGROUND

To achieve high compression efficiency, image and video coding schemes usually employ prediction, including motion vector prediction, and transform to leverage spatial and temporal redundancy in the video content. Generally, intra or inter prediction is used to exploit the intra or inter frame correlations, then the difference between an original image block and its prediction, often denoted as prediction error or prediction residual, is transformed, quantized, and entropy coded. To reconstruct the video, the compressed data are decoded by inverse processes corresponding to the entropy coding, quantization, transform, and prediction.

Recent additions to video compression technology include various industry standards, versions of the reference software and/or documentations such as Joint Exploration Model (JEM) and later VTM (Versatile Video Coding (VVC) Test Model) being developed by the JVET (Joint Video Exploration Team) group. The aim is to make further improvements to the existing HEVC (High Efficiency Video Coding) standard.
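
The prediction-and-residual principle described in the background can be illustrated with a small Python sketch. It is a simplified toy, not any codec's actual pipeline: the block is one-dimensional, the transform is omitted, entropy coding is not modeled, and the sample values and the quantization step `qstep` are hypothetical.

```python
def encode_block(block, prediction, qstep):
    """Return quantized residual levels for one block (transform omitted)."""
    residual = [b - p for b, p in zip(block, prediction)]
    return [round(r / qstep) for r in residual]


def decode_block(levels, prediction, qstep):
    """Reconstruct the block from the levels and the same prediction."""
    return [p + level * qstep for p, level in zip(prediction, levels)]


block = [100, 104, 109, 115]       # original samples
prediction = [101, 103, 108, 110]  # e.g. the output of an intra predictor
qstep = 4                          # hypothetical quantization step

levels = encode_block(block, prediction, qstep)
recon = decode_block(levels, prediction, qstep)
```

The better the prediction, the smaller the residual levels that have to be coded. The reconstruction differs from the original by at most half a quantization step per sample, since quantization is the only lossy step in this sketch.
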
Recent works introduce deep neural networks for improving video compression efficiency in terms of bitrate savings. For instance, a deep intra predictor infers a prediction of the current block from the context surrounding that block. According to previous works related to the learning of a deep intra predictor for image and video compression, when the learned deep intra predictor is inserted into a video codec, such as HEVC for instance, it is always at least one additional intra prediction mode, in competition with the existing ones. Indeed, the intra prediction component of video codecs cannot rely on the deep intra predictor alone because of the “failure cases”. A “failure case” refers to a situation where the learned deep intra predictor infers a prediction of the current block of relatively low quality, compared to the prediction with the best quality among the predictions provided by the regular intra prediction modes in the video codec of interest. In general, a “failure case” occurs when the information in the context is not enough for inferring a prediction of the current block of good quality or, roughly speaking, the context is too decorrelated from the current block.

SUMMARY

The drawbacks and disadvantages of the prior art are solved and addressed by the general aspects described herein, which are directed to a deep intra predictor generating side information. According to at least one embodiment, the deep intra predictor generates side information on the encoder side, the side information is written to the bitstream, and the deep intra predictor reads the side information on the decoder side. This way, it is possible to transmit from the encoder to the decoder information to supplement the information contained in the context surrounding the current block to be predicted.

According to a first aspect, there is provided a method.
The method comprises determining, for a block being encoded, a neural network intra prediction and side information using a neural network from at least one input data; encoding the block based on the neural network intra prediction; and encoding the side information.

According to another aspect, there is provided a second method. The method comprises obtaining, for a block being decoded, side information relative to a neural network intra prediction; determining, for the block being decoded, a neural network intra prediction using a neural network applied to at least one input data and the side information; and decoding the block using said determined neural network intra prediction.

According to another aspect, there is provided an apparatus. The apparatus comprises one or more processors, wherein the one or more processors are configured to: determine, for a block being encoded, a neural network intra prediction and side information using a neural network from at least one input data; encode the block based on the neural network intra prediction; and encode the side information.

According to another aspect, there is provided another apparatus. The apparatus comprises one or more processo