US-20260127777-A1 - METHOD, APPARATUS, AND MEDIUM FOR VIDEO PROCESSING


Abstract

Embodiments of the present disclosure provide a solution for video processing. A method for video processing is proposed. The method comprises: performing a conversion between a video and a bitstream of the video, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF.
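
As an illustration only, the constraint summarized above could be checked along the following lines. This is a minimal sketch, not the claimed implementation: the `NnpfcSei` record and the `find_purpose_mismatches` helper are hypothetical names introduced here, only the two fields named in the claims (`nnpfc_id` and `nnpfc_purpose`) are modeled, and the caller is assumed to have already grouped the messages by CLVS.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class NnpfcSei:
    """Hypothetical, simplified view of one NNPFC SEI message."""
    nnpfc_id: int       # identifying number of the NNPF
    nnpfc_purpose: int  # purpose indicated for the NNPF


def find_purpose_mismatches(
    clvs_messages: Iterable[NnpfcSei],
) -> List[Tuple[int, int, int]]:
    """Return (nnpfc_id, first_purpose, conflicting_purpose) per violation.

    Assumption: every message passed in belongs to the same CLVS.
    """
    first_purpose: Dict[int, int] = {}
    mismatches: List[Tuple[int, int, int]] = []
    for msg in clvs_messages:
        # Remember the purpose seen first for this id; compare later ones to it.
        expected = first_purpose.setdefault(msg.nnpfc_id, msg.nnpfc_purpose)
        if msg.nnpfc_purpose != expected:
            mismatches.append((msg.nnpfc_id, expected, msg.nnpfc_purpose))
    return mismatches
```

An empty result would mean that, within the CLVS examined, every message sharing an identifying number indicates the same purpose.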

Inventors

Wei Jia
Ye-Kui Wang
Jizheng Xu

Assignees

BYTEDANCE INC.

Dates

Publication Date: 20260507
Application Date: 20251231

Claims (20)

1. A method for video processing, comprising: performing a conversion between a video and a bitstream of the video, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF.
2. The method of claim 1, wherein each of the plurality of messages is a supplemental enhancement information (SEI) message.
3. The method of claim 1, wherein the purpose of the NNPF is indicated by a first indication in each of the plurality of messages, and values of first indications in all of the plurality of messages are same.
4. The method of claim 3, wherein the first indication comprises a syntax element nnpfc_purpose.
5. The method of claim 3, wherein the identifying number is indicated by a second indication in each of the plurality of messages.
6. The method of claim 5, wherein the second indication comprises a syntax element nnpfc_id.
7. The method of claim 1, wherein the conversion includes encoding the video into the bitstream.
8. The method of claim 1, wherein the conversion includes decoding the video from the bitstream.
9. A method for video processing, comprising: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and if the number of the at least one input picture is greater than 1, each of the at least one input picture is derived.
10. The method of claim 9, wherein if the number of the at least one input picture for the NNPF is greater than 1, for the j-th inference in a loop of NNPF inferences for j in a range of 0 to N−1, a process for deriving the i-th input picture and an indication representing a presence of the i-th input picture is performed for each value of i in a range of j+1 to N−1, inclusive, in increasing order of i, and wherein N represents the number of the at least one input picture for the NNPF.
11. The method of claim 10, wherein the indication representing the presence of the i-th input picture is represented as inputPresentFlag[i].
12. The method of claim 9, wherein if a width of a source picture is equal to a width of a cropped picture and a height of the source picture is equal to a height of the cropped picture, a resampled picture is set to be the same as the source picture.
13. The method of claim 12, wherein the width of the source picture is represented as sourceWidth, or the width of the cropped picture is represented as CroppedWidth, or the height of the source picture is represented as sourceHeight, or the height of the cropped picture is represented as CroppedHeight, or the resampled picture is represented as resampledPic, or the source picture is represented as sourcePic.
14. The method of claim 9, wherein if a purpose of the NNPF comprises picture rate upsampling, the number of interpolated pictures generated by the NNPF between the (k−1)-th and the k-th input pictures for the NNPF is equal to 0, and wherein k is an integer.
15. The method of claim 9, wherein the current picture comprises a decoded picture or a cropped decoded picture of the video.
16. The method of claim 9, wherein the conversion includes encoding the video into the bitstream.
17. The method of claim 9, wherein the conversion includes decoding the video from the bitstream.
18. An apparatus for video processing comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to perform operations comprising: performing a conversion between a video and a bitstream of the video, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF.
19. A non-transitory computer-readable storage medium storing instructions that cause a processor to perform operations comprising: performing a conversion between a video and a bitstream of the video, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF.
20. A non-transitory computer-readable recording medium storing a bitstream of a video which is generated by a method performed by an apparatus for video processing, wherein the method comprises: performing a conversion between the video and the bitstream, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF.
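
For illustration, the conditions recited in claims 10 and 12 might be sketched as follows. This is a non-authoritative sketch under stated assumptions: the `Picture` type (a single sample plane stored as rows of integers) and the function names are hypothetical, and the resampling used when the dimensions differ is not specified in the claims, so it is left to a caller-supplied `resample` callback.

```python
from typing import Callable, List, Tuple

# Hypothetical picture model: one sample plane stored as rows of integers.
Picture = List[List[int]]


def derive_resampled_pic(
    source_pic: Picture,
    cropped_width: int,
    cropped_height: int,
    resample: Callable[[Picture, int, int], Picture],
) -> Picture:
    """Return resampledPic for the given sourcePic.

    When sourceWidth == CroppedWidth and sourceHeight == CroppedHeight,
    the source picture itself is returned (the claim 12 condition).
    Otherwise the caller-supplied resampler is used; which resampler
    applies is an assumption left open here.
    """
    source_height = len(source_pic)
    source_width = len(source_pic[0]) if source_pic else 0
    if source_width == cropped_width and source_height == cropped_height:
        return source_pic
    return resample(source_pic, cropped_width, cropped_height)


def derivation_order(num_inputs: int) -> List[Tuple[int, int]]:
    """List (j, i) pairs following the wording of claim 10.

    For inference j in 0..N-1, input picture i (and inputPresentFlag[i])
    is derived for i = j+1..N-1, in increasing order of i.
    """
    return [
        (j, i)
        for j in range(num_inputs)
        for i in range(j + 1, num_inputs)
    ]
```

With three input pictures, `derivation_order(3)` lists the pairs (0, 1), (0, 2) and (1, 2); the last inference has no remaining index in the recited range.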

Description

CROSS REFERENCE

This application is a continuation of International Application No. PCT/US2024/036586, filed on Jul. 2, 2024, which claims the benefit of U.S. Provisional Application No. 63/511,814, filed on Jul. 3, 2023. The entire contents of these applications are hereby incorporated by reference in their entireties.

FIELDS

Embodiments of the present disclosure relate generally to video processing techniques, and more particularly, to a neural-network post-processing filter (NNPF).

BACKGROUND

Nowadays, digital video capabilities are applied in various aspects of people's lives. Multiple types of video compression technologies, such as MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4 Part 10 Advanced Video Coding (AVC), the ITU-T H.265 high efficiency video coding (HEVC) standard, and the versatile video coding (VVC) standard, have been proposed for video encoding/decoding. However, the functionality of video coding techniques is generally expected to be further improved.

SUMMARY

Embodiments of the present disclosure provide a solution for video processing.

In a first aspect, a method for video processing is proposed. The method comprises: performing a conversion between a video and a bitstream of the video, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF. Based on the method in accordance with the first aspect of the present disclosure, all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream are required to indicate a same purpose of the NNPF.
Compared with the conventional solution lacking such a constraint, the proposed method can advantageously avoid unintentional mismatch and thus ensure a proper functionality of NNPF.

In a second aspect, another method for video processing is proposed. The method comprises: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and if the number of the at least one input picture is greater than 1, each of the at least one input picture is derived. Based on the method in accordance with the second aspect of the present disclosure, it is required that each of the at least one input picture for the NNPF is derived, especially in a case where the number of the at least one input picture is greater than 1. Compared with the conventional solution lacking such a constraint, the proposed method can advantageously guarantee that all of the input pictures needed for the NNPF are derived and thus ensure a proper functionality of NNPF.

In a third aspect, an apparatus for video processing is proposed. The apparatus comprises a processor and a non-transitory memory with instructions thereon. The instructions, upon execution by the processor, cause the processor to perform a method in accordance with the first or second aspect of the present disclosure.

In a fourth aspect, a non-transitory computer-readable storage medium is proposed. The non-transitory computer-readable storage medium stores instructions that cause a processor to perform a method in accordance with the first or second aspect of the present disclosure.

In a fifth aspect, another non-transitory computer-readable recording medium is proposed. The non-transitory computer-readable recording medium stores a bitstream of a video which is generated by a method performed by an apparatus for video processing.
The method comprises: performing a conversion between the video and the bitstream, wherein all of a plurality of messages for neural-network post-filter characteristics (NNPFC) with a same value of an identifying number for identifying an NNPF within a coded layer video sequence (CLVS) in the bitstream indicate a same purpose of the NNPF.

In a sixth aspect, another non-transitory computer-readable recording medium is proposed. The non-transitory computer-readable recording medium stores a bitstream of a video which is generated by a method performed by an apparatus for video processing. The method comprises: performing a conversion between a video and a bitstream of the video, wherein a neural-network post-processing filter (NNPF) is applied on a current picture associated with the video based on at least one input picture for the NNPF, and if the number of the at least one input picture is greater than 1, each of the at least one input picture is derived.

In a seventh aspect, a method for storing a bitstream of a video is proposed. The method comprises: generating the bitstream from the video; and storing the bitstream in a non-transitory computer-readable recording medium, wherein all of a plurality of messages for neural-network post-