US-20260129247-A1 - METHOD, APPARATUS, AND MEDIUM FOR VIDEO PROCESSING
Abstract
Embodiments of the present disclosure provide a solution for video processing. A method for video processing is proposed. The method comprises: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
Inventors
- Jizheng Xu
- Ye-Kui Wang
Assignees
- BYTEDANCE INC.
Dates
- Publication Date: 20260507
- Application Date: 20251231
Claims (20)
- 1. A method for video processing, comprising: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
- 2. The method of claim 1, wherein for any particular pair of pictures consecutive in an output order in a list of decoded pictures in the output order resulted from decoding the bitstream, if there are one or more pictures in a list of NNPF output pictures that are between the particular pair of pictures in the output order, the one or more pictures are among pictures that are output by applying a particular NNPF with a purpose comprising picture rate upsampling when a first picture in the list of decoded pictures is the current picture.
- 3. The method of claim 2, wherein at least one of the following does not output any picture between the particular pair of pictures in the output order: an application of any NNPF that is different from the particular NNPF and used in a filtering process for one picture when the first picture is the current picture, or an application of any NNPF that is used in a filtering process for one picture when a second picture in the list of decoded pictures is the current picture, the second picture being different from the first picture.
- 4. The method of claim 2, wherein the list of decoded pictures comprises at least one decoded picture or at least one cropped decoded picture of the video.
- 5. The method of claim 1, wherein if a purpose of an NNPF comprises picture rate upsampling and the NNPF is allowed to be used for post-processing filtering for the current picture and all subsequent pictures of a current layer in an output order until a condition is met, one or more interpolated pictures are generated only between one pair of consecutive input pictures for the NNPF, and wherein the current layer is a layer comprising the current picture.
- 6. The method of claim 5, wherein an indication pictureRateUpsamplingFlag indicates whether the purpose of the NNPF comprises picture rate upsampling or not, or an indication nnpfa_persistence_flag indicates whether the NNPF is allowed to be used for post-processing filtering for the current picture and all subsequent pictures of the current layer in the output order until the condition is met.
- 7. The method of claim 1, wherein a generation of at least one new picture is performed no more than once between any two consecutive pictures in an output order in the bitstream, or wherein a generation of at least one new picture is performed no more than once between any two consecutive pictures in an output order in a coded layer video sequence (CLVS) in the bitstream, or wherein for an NNPF with a purpose comprising picture rate upsampling, if the NNPF is activated for more than one picture in the bitstream, for any pair of consecutive pictures in an output order in the bitstream, at most one of the activations of the NNPF is allowed to interpolate pictures between that pair of consecutive pictures.
- 8. The method of claim 1, wherein an output time instance of each of at least one generated picture during picture rate upsampling is specified, and the output time instance of each of the at least one generated picture is determined based on that the at least one generated picture between two consecutive input pictures during picture rate upsampling is uniformly located between the two consecutive input pictures.
- 9. The method of claim 1, wherein the current picture comprises a decoded picture or a cropped decoded picture of the video.
- 10. The method of claim 1, wherein the conversion includes encoding the video into the bitstream.
- 11. The method of claim 1, wherein the conversion includes decoding the video from the bitstream.
- 12. An apparatus for video processing comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform operations comprising: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
- 13. The apparatus of claim 12, wherein for any particular pair of pictures consecutive in an output order in a list of decoded pictures in the output order resulted from decoding the bitstream, if there are one or more pictures in a list of NNPF output pictures that are between the particular pair of pictures in the output order, the one or more pictures are among pictures that are output by applying a particular NNPF with a purpose comprising picture rate upsampling when a first picture in the list of decoded pictures is the current picture.
- 14. The apparatus of claim 13, wherein at least one of the following does not output any picture between the particular pair of pictures in the output order: an application of any NNPF that is different from the particular NNPF and used in a filtering process for one picture when the first picture is the current picture, or an application of any NNPF that is used in a filtering process for one picture when a second picture in the list of decoded pictures is the current picture, the second picture being different from the first picture.
- 15. A non-transitory computer-readable storage medium storing instructions that cause a processor to perform operations comprising: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
- 16. The non-transitory computer-readable storage medium of claim 15, wherein for any particular pair of pictures consecutive in an output order in a list of decoded pictures in the output order resulted from decoding the bitstream, if there are one or more pictures in a list of NNPF output pictures that are between the particular pair of pictures in the output order, the one or more pictures are among pictures that are output by applying a particular NNPF with a purpose comprising picture rate upsampling when a first picture in the list of decoded pictures is the current picture.
- 17. The non-transitory computer-readable storage medium of claim 16, wherein at least one of the following does not output any picture between the particular pair of pictures in the output order: an application of any NNPF that is different from the particular NNPF and used in a filtering process for one picture when the first picture is the current picture, or an application of any NNPF that is used in a filtering process for one picture when a second picture in the list of decoded pictures is the current picture, the second picture being different from the first picture.
- 18. A non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by an apparatus for video processing, wherein the method comprises: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
- 19. The non-transitory computer-readable recording medium of claim 18, wherein for any particular pair of pictures consecutive in an output order in a list of decoded pictures in the output order resulted from decoding the bitstream, if there are one or more pictures in a list of NNPF output pictures that are between the particular pair of pictures in the output order, the one or more pictures are among pictures that are output by applying a particular NNPF with a purpose comprising picture rate upsampling when a first picture in the list of decoded pictures is the current picture.
- 20. The non-transitory computer-readable recording medium of claim 19, wherein at least one of the following does not output any picture between the particular pair of pictures in the output order: an application of any NNPF that is different from the particular NNPF and used in a filtering process for one picture when the first picture is the current picture, or an application of any NNPF that is used in a filtering process for one picture when a second picture in the list of decoded pictures is the current picture, the second picture being different from the first picture.
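Claim 8 states that pictures generated between two consecutive input pictures during picture rate upsampling are uniformly located between those two pictures. The following is a minimal sketch of that timing rule, not text from the disclosure. It assumes output times are given as integers or exact rationals in arbitrary units; the function name `interpolated_output_times` and its parameters are hypothetical and do not correspond to any syntax element.

```python
from fractions import Fraction


def interpolated_output_times(t_first, t_second, num_generated):
    """Return output times for pictures generated between two input pictures.

    Assumption (illustrative only): t_first and t_second are the output times
    of two consecutive input pictures, and num_generated pictures are placed
    at equal spacing strictly between them.
    """
    if num_generated < 0:
        raise ValueError("num_generated must be non-negative")
    if t_second <= t_first:
        raise ValueError("t_second must be later than t_first")
    # num_generated pictures split the interval into num_generated + 1 parts.
    step = Fraction(t_second - t_first) / (num_generated + 1)
    return [t_first + k * step for k in range(1, num_generated + 1)]


if __name__ == "__main__":
    # Three generated pictures between input pictures output at times 0 and 4.
    print(interpolated_output_times(0, 4, 3))
```

Exact rationals are used here only to avoid rounding drift in the illustration; a real system would express these instants in its own clock units.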
Description
CROSS REFERENCE
This application is a continuation of International Application No. PCT/US2024/036589, filed on Jul. 2, 2024, which claims the benefit of U.S. Provisional Application No. 63/511,818, filed on Jul. 3, 2023, and U.S. Provisional Application No. 63/513,456, filed on Jul. 13, 2023. The entire contents of these applications are hereby incorporated by reference in their entireties.
FIELDS
Embodiments of the present disclosure relate generally to video processing techniques, and more particularly, to a neural-network post-processing filter (NNPF).
BACKGROUND
Nowadays, digital video capabilities are being applied in various aspects of people's lives. Multiple types of video compression technologies, such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the ITU-T H.265 high efficiency video coding (HEVC) standard, and the versatile video coding (VVC) standard, have been proposed for video encoding/decoding. However, the functionality of video coding techniques is generally expected to be further improved.
SUMMARY
Embodiments of the present disclosure provide a solution for video processing.
In a first aspect, a method for video processing is proposed. The method comprises: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
Based on the method in accordance with the first aspect of the present disclosure, it is specified that a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
Compared with the conventional solution lacking such a constraint, the proposed method can advantageously avoid multiple output pictures in one NNPF inference instance. Thereby, a proper functionality of NNPF can be ensured.
In a second aspect, an apparatus for video processing is proposed. The apparatus comprises a processor and a non-transitory memory with instructions thereon. The instructions, upon execution by the processor, cause the processor to perform a method in accordance with the first aspect of the present disclosure.
In a third aspect, a non-transitory computer-readable storage medium is proposed. The non-transitory computer-readable storage medium stores instructions that cause a processor to perform a method in accordance with the first aspect of the present disclosure.
In a fourth aspect, another non-transitory computer-readable recording medium is proposed. The non-transitory computer-readable recording medium stores a bitstream of a video which is generated by a method performed by an apparatus for video processing. The method comprises: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
In a fifth aspect, a method for storing a bitstream of a video is proposed. The method comprises: generating the bitstream from the video; and storing the bitstream in a non-transitory computer-readable recording medium, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and a generation of at least one NNPF output picture is performed no more than once between any particular pair of consecutive input pictures.
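The constraint recited in the aspects above, that NNPF output pictures are generated no more than once between any particular pair of consecutive input pictures, can be illustrated with a small checker. This is a hedged sketch under assumptions that are not part of the disclosure: each generation event is modeled as a hypothetical tuple `(current_picture_index, pair_index)`, meaning that the NNPF application for which picture `current_picture_index` is the current picture generated one or more pictures between input pictures `pair_index` and `pair_index + 1` in output order. The function name is likewise illustrative.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def pairs_generated_more_than_once(
    events: Iterable[Tuple[int, int]],
) -> List[int]:
    """Return the pair indices that violate the once-per-pair constraint.

    Each event is (current_picture_index, pair_index); both fields are
    hypothetical bookkeeping values chosen for this illustration.
    An empty result means the modeled sequence satisfies the constraint.
    """
    # Count how many generation events target each pair of input pictures.
    per_pair = Counter(pair_index for _, pair_index in events)
    return sorted(pair for pair, count in per_pair.items() if count > 1)


if __name__ == "__main__":
    # Each application interpolates only between its own pair: conforming.
    print(pairs_generated_more_than_once([(0, 0), (1, 1), (2, 2)]))
    # The application at picture 1 also interpolates between pictures 0 and 1,
    # which the application at picture 0 already did: pair 0 is a violation.
    print(pairs_generated_more_than_once([(0, 0), (1, 0), (1, 1)]))
```

The second call models the situation the constraint is meant to exclude, for example a persistent picture-rate-upsampling NNPF whose successive activations overlap in their input pictures.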
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
Through the following detailed description with reference to the accompanying drawings, the above and other objectives, features, and advantages of example embodiments of the present disclosure will become more apparent. In the example embodiments of the present disclosure, the same reference numerals usually refer to the same components.
FIG. 1 is a block diagram illustrating an example video coding system, in accordance with some embodiments of the present disclosure;
FIG. 2 is a block diagram illustrating a first example video encoder, in accordance with some embodiments of the present disclosure;
FIG. 3 is a block diagram illustrating an example video decoder, in accordance with some embodiments of the present disclosure;
FIG. 4 illustrates an illustration of