CN-122003868-A - Purpose and various constraints of a neural network post-processing filter bank

Abstract

A mechanism for processing video data is disclosed. The mechanism includes determining that a plurality of neural network post-processing filters (NNPFs) or neural network post-processing filter banks (NNPFGs) defined in one or more Supplemental Enhancement Information (SEI) messages share the same NNPF purpose. A conversion between visual media data and a bitstream is then performed based on the NNPF purpose.
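
The determination described above can be illustrated with a minimal sketch. It is not the syntax or decoding process of the disclosure: the `Nnpf` class, the integer purpose codes, and both function names are hypothetical, and the grouping rule (type 1 requires one shared purpose, type 0 allows different purposes) is taken from the claims.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Nnpf:
    """Hypothetical stand-in for one filter described by an SEI message."""
    filter_id: int
    purpose: int  # hypothetical integer code for the NNPF purpose


def shared_purpose(filters: List[Nnpf]) -> Optional[int]:
    """Return the common purpose if every filter has the same one, else None."""
    purposes = {f.purpose for f in filters}
    if len(purposes) == 1:
        return next(iter(purposes))
    return None


def purpose_is_consistent(grouping_type: int, filters: List[Nnpf]) -> bool:
    """Grouping type 1 requires one shared purpose; type 0 lets members differ."""
    if grouping_type == 1:
        return shared_purpose(filters) is not None
    return True
```

A caller holding the filters of one group could run `purpose_is_consistent(1, filters)` before signaling a single purpose for the whole group.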

Inventors

  • JIA WEI
  • WANG YEKUI
  • ZHANG KAI
  • ZHANG LI

Assignees

  • 字节跳动有限公司 (ByteDance Ltd.)

Dates

Publication Date
20260508
Application Date
20241004
Priority Date
20231005

Claims (19)

  1. A method for processing media data, comprising: determining that a plurality of neural network post-processing filters (NNPFs) or neural network post-processing filter banks (NNPFGs) defined in one or more Supplemental Enhancement Information (SEI) messages share the same NNPF purpose; and performing a conversion between visual media data and a bitstream based on the NNPF purpose.
  2. The method of claim 1, wherein the shared NNPF purpose is signaled in one SEI message.
  3. The method of claim 1 or 2, wherein the shared NNPF purpose is conditionally signaled.
  4. The method of any of claims 1-3, wherein all NNPFs or NNPFGs defined in a neural network post-processing filter bank characteristics (NNPFGC) SEI message have the same NNPF purpose when nnpfgc_grouping_type is equal to 1, or wherein a condition that nnpfgc_purpose is equal to 1 is inserted when signaling the NNPF purpose.
  5. The method of any of claims 1-4, wherein NNPFGs defined in one SEI message have different NNPF purposes.
  6. The method of any of claims 1-5, wherein a plurality of NNPF purposes are signaled in one or more SEI messages, or wherein the plurality of NNPF purposes are conditionally signaled.
  7. The method of any of claims 1-6, wherein all NNPFs or NNPFGs defined in the NNPFGC SEI message are allowed to have different NNPF purposes when nnpfgc_grouping_type is equal to 0, or wherein a condition that nnpfgc_purpose is equal to 0 is removed when signaling the NNPF purpose.
  8. The method of any of claims 1-7, wherein any member of the NNPFs or the NNPFGs held in a one-dimensional container, such as a list, vector, or array, is within a range, and wherein bitstream conformance requires that, for a member to be within the range, the candIdx associated with the member does not exceed the number of pictures in candInputPicList[ m ] minus 1.
  9. The method of any of claims 1-8, wherein, when nnpfgc_grouping_type defined in the NNPFGC SEI message is equal to 1, a neural network post-processing filter bank activation (NNPFGA) SEI message is used to activate the corresponding NNPFG.
  10. The method of any of claims 1-9, wherein an NNPFGA SEI message having a particular value of nnpfga_target_id is not present in a current picture unit (PU) unless an NNPFGC SEI message having nnpfgc_id equal to the particular value of nnpfga_target_id and nnpfgc_grouping_type equal to 1 is present in the current PU or in a PU within a current coded layer video sequence (CLVS) that precedes the current PU in decoding order.
  11. The method of any of claims 1-10, wherein, when NnpfCand contains NNPF groups with nnpfgc_grouping_type equal to 1 and the NNPF groups are activated for the current picture according to NNPFGA SEI messages, the following applies: the set candSet of candidate NNPFs or NNPF groups is initially empty and is then set to include (1) the NNPFs that are activated for the current picture according to neural network post-processing filter activation (NNPFA) SEI messages and included in NnpfCand, and (2) the NNPF groups that are activated for the current picture according to NNPFGA SEI messages and included in NnpfCand; wherein, for each candidate NNPF or NNPF group CANDFILTER in candSet, CANDFILTER is excluded from candSet when one or more input pictures of CANDFILTER are input pictures of the NNPF or NNPF group PREVFILTER used in any previous invocation of the filtering process specified in that sub-clause for the same NnpfCand; and wherein any NNPF or NNPF group remaining in candSet is selected to be applied to the current picture.
  12. The method of any of claims 1-11, wherein each activated member of an NNPFGA SEI message, whether an NNPF or an NNPFG, has to generate at least one output picture.
  13. The method of any of claims 1-12, wherein, when PictureRateUpsamplingFlag derived from the i-th NNPF is equal to 0 and nnpfga_num_output_entries[ i ] is equal to NumInpPicsInOutputTensor derived from the i-th NNPF, nnpfga_output_flag[ i ][ j ] must be equal to 1 for at least one value of j in the range of 0 to nnpfga_num_output_entries[ i ] - 1, inclusive.
  14. The method of any of claims 1-13, wherein the conversion comprises encoding the visual media data into the bitstream.
  15. The method of any of claims 1-13, wherein the conversion comprises decoding the visual media data from the bitstream.
  16. An apparatus for processing video data, comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform the method of any of claims 1-15.
  17. A non-transitory computer readable medium comprising a computer program product for use by a video codec device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that, when executed by a processor, the instructions cause the video codec device to perform the method of any of claims 1-15.
  18. A non-transitory computer readable recording medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method comprises: determining that a plurality of neural network post-processing filters (NNPFs) or neural network post-processing filter banks (NNPFGs) defined in one or more Supplemental Enhancement Information (SEI) messages share the same NNPF purpose; and generating the bitstream based on the determination.
  19. A method for storing a bitstream of video, comprising: determining that a plurality of neural network post-processing filters (NNPFs) or neural network post-processing filter banks (NNPFGs) defined in one or more Supplemental Enhancement Information (SEI) messages share the same NNPF purpose; generating the bitstream based on the determination; and storing the bitstream in a non-transitory computer readable recording medium.
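
The candidate selection recited in claim 11 can be sketched as follows. This is an illustrative reading under stated assumptions, not the normative process: the `Candidate` class, the integer picture identifiers, and the function name are hypothetical, and the caller is assumed to have already collected the filters and filter groups activated for the current picture.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical NNPF or NNPF group that is active for the current picture."""
    name: str
    input_pictures: FrozenSet[int]  # hypothetical picture identifiers


def select_candidates(activated: List[Candidate],
                      previously_used: List[Candidate]) -> List[Candidate]:
    """Build candSet, then drop entries that overlap earlier invocations."""
    # Input pictures consumed by any previous invocation for the same NnpfCand.
    used_inputs = set()
    for prev in previously_used:
        used_inputs |= prev.input_pictures

    cand_set = []
    for cand in activated:
        # Exclude a candidate if at least one of its input pictures was
        # already an input picture of a previously used filter or group.
        if cand.input_pictures & used_inputs:
            continue
        cand_set.append(cand)
    # Whatever remains is selected to be applied to the current picture.
    return cand_set
```

Collecting all previously used input pictures into one set keeps the exclusion test to a single intersection per candidate.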

Description

Purpose and various constraints of a neural network post-processing filter bank

Cross Reference to Related Applications

The present application claims priority to and the benefit of U.S. provisional patent application No. 63/588,153, filed on October 5, 2023, which is incorporated herein by reference in its entirety.

Technical Field

The present disclosure relates to the generation, storage, and consumption of digital audio video media information in a file format.

Background

Digital video accounts for the largest share of bandwidth used on the Internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video usage is likely to continue to grow.

Disclosure of Invention

A first aspect relates to a method for processing video data, comprising determining that a plurality of neural network post-processing filters (NNPFs) or neural network post-processing filter banks (NNPFGs) defined in one or more Supplemental Enhancement Information (SEI) messages share the same NNPF purpose, and performing a conversion between visual media data and a bitstream based on the NNPF purpose.

A second aspect relates to an apparatus for processing video data, comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform the method of any of the preceding aspects.

A third aspect relates to a non-transitory computer readable medium comprising a computer program product for use by a video codec device, the computer program product comprising computer executable instructions stored on the non-transitory computer readable medium such that, when executed by a processor, the instructions cause the video codec device to perform the method of any of the preceding aspects.

A fourth aspect relates to a non-transitory computer readable recording medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method includes determining that a plurality of NNPFs or NNPFGs defined in one or more SEI messages share the same NNPF purpose, and generating the bitstream based on the determination.

A fifth aspect relates to a method for storing a bitstream of video, comprising determining that a plurality of NNPFs or NNPFGs defined in one or more SEI messages share the same NNPF purpose, generating a bitstream based on the determination, and storing the bitstream in a non-transitory computer readable recording medium.

A sixth aspect relates to a method, apparatus, or system described in the present disclosure.

For the purpose of clarity, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create a new embodiment within the scope of the present disclosure.

These and other features will be more clearly understood from the following detailed description and claims, taken in conjunction with the accompanying drawings.

Drawings

For a more complete understanding of the present disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.

Fig. 1 is a block diagram illustrating an example video processing system.
Fig. 2 is a block diagram of an example video processing apparatus.
Fig. 3 is a flow chart of an example method of video processing.
Fig. 4 is a block diagram illustrating an example video codec system.
Fig. 5 is a block diagram illustrating an example encoder.
Fig. 6 is a block diagram illustrating an example decoder.
Fig. 7 is a schematic diagram of an example encoder.

Detailed Description

It should be understood at the outset that, although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or yet to be developed. The disclosure should in no way be limited to the illustrative implementations, drawings, and embodiments shown below, including the exemplary designs and implementations shown and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.

Section headings are used in this disclosure for ease of understanding and are not intended to limit the applicability of the techniques and embodiments disclosed in each section to that section only. Furthermore, H.266 terminology is used in some descriptions merely for ease of understanding and is not intended to limit the scope of the disclosed embodiments. Thus, the embodiments descr