CN-121986477-A - Optimizing signaling for preprocessing

Abstract

A mechanism for processing video data is disclosed. The mechanism includes determining that pre-processing information including properties or parameters of an optimized filter is included in a bitstream. A conversion is performed between visual media data and the bitstream based on the pre-processing information.

Inventors

JIA WEI
LI YUE
WANG YEKUI
ZHANG KAI
ZHANG LI

Assignees

ByteDance Co., Ltd. (字节跳动有限公司)

Dates

Publication Date: 20260505
Application Date: 20241004
Priority Date: 20231005

Claims (20)

1. A method for processing media data, comprising: determining that pre-processing information including properties or parameters of an optimized filter is included in a bitstream; and performing a conversion between visual media data and the bitstream based on the pre-processing information.
2. The method of claim 1, wherein the filter type is signaled when the optimization type indicates spatial sub-sampling.
3. The method of any of claims 1-2, wherein the bitstream contains one or more of an indication of a nearest neighbor interpolation type, a bilinear interpolation type, a bicubic interpolation type, a Lanczos interpolation type, a bit-exact bilinear interpolation type, a bit-exact nearest neighbor interpolation type, a mask type of an interpolation code, a neural network filter of an ISO/IEC 15938-17 format type, and a neural network filter of a Uniform Resource Identifier (URI) type for preprocessing optimization.
4. The method of any of claims 1-3, wherein the filter type is signaled when the optimization type indicates time domain sub-sampling.
5. The method of any of claims 1-4, wherein the bitstream contains an indication of one or more of the following for preprocessing optimization: an original frame rate and a sub-sampling frame rate; the original frame rate and a sub-sampling scaling factor; and the sub-sampling frame rate and the sub-sampling scaling factor.
6. The method of any of claims 1-5, wherein the filter type is signaled when the optimization type indicates object-based optimization.
7. The method of any of claims 1-6, wherein the bitstream contains an indication of one or more of the following filter types for preprocessing optimization: a you only look once (YOLO) type of neural network filter, a faster region-based convolutional neural network (Faster R-CNN) type of neural network filter, an ISO/IEC 15938-17 format type of neural network filter, and a URI type of neural network filter.
8. The method of any of claims 1-7, wherein the filter type is signaled when the optimization type indicates a time domain quality or a spatial domain quality.
9. The method of any of claims 1-8, wherein the bitstream contains an indication of one or more of the following filter types for preprocessing optimization: an ISO/IEC 15938-17 format type neural network filter, and a URI type neural network filter.
10. The method of any of claims 1-9, wherein the bitstream contains an indication of an amount of information used by an optimization filter for preprocessing optimization.
11. The method of any of claims 1-10, wherein the bitstream contains an indication of a specific number of filters per type of optimization applied.
12. The method of any of claims 1-11, wherein the bitstream contains an indication of a number of filters applied with an optimization type, and wherein the optimization type is indicated as object-based optimization, time-domain sub-sampling optimization, spatial sub-sampling optimization, time-domain quality optimization, spatial quality optimization, or a combination thereof.
13. The method of any of claims 1-12, wherein the pre-processing information is included in a syntax element coded with a binarization as a flag, a fixed length code, an exponential-Golomb EG(x) code, a unary code, a truncated binary code, a signed element, an unsigned element, with a context model, with bypass coding, with conditional coding, or a combination thereof.
14. The method according to any of claims 1-13, wherein the pre-processing information is signaled only when the corresponding function is applicable or only when the width and/or height of the block meets a condition.
15. The method according to any of claims 1-14, wherein the pre-processing information is signaled at a block level, a sequence level, a group of pictures level, a picture level, a slice level, or a slice group level, or wherein the pre-processing information is signaled in a Coding Tree Unit (CTU), a Coding Unit (CU), a Transform Unit (TU), a Prediction Unit (PU), a Coding Tree Block (CTB), a Coding Block (CB), a Transform Block (TB), a Prediction Block (PB), a sequence header, a picture header, a Sequence Parameter Set (SPS), a Video Parameter Set (VPS), a Dependency Parameter Set (DPS), Decoding Capability Information (DCI), a Picture Parameter Set (PPS), an Adaptive Parameter Set (APS), a slice header, a slice group header, or a Supplemental Enhancement Information (SEI) message.
16. The method of any of claims 1-15, wherein the converting comprises encoding the visual media data into the bitstream.
17. The method of any of claims 1-15, wherein the converting comprises decoding the visual media data from the bitstream.
18. An apparatus for processing video data comprising a processor and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform the method of any of claims 1-17.
19. A non-transitory computer readable medium comprising a computer program product for use by a video codec device, the computer program product comprising computer executable instructions, wherein the computer executable instructions are stored on the non-transitory computer readable medium such that, when executed by a processor, they cause the video codec device to perform the method according to any one of claims 1-17.
20. A non-transitory computer readable recording medium storing a bitstream of video generated by a method performed by a video processing apparatus, wherein the method comprises: determining that pre-processing information including properties or parameters of an optimized filter is included in a bitstream; and generating the bitstream based on the determination.
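
Illustrative sketch (not part of the claims). The conditional signaling recited in claims 2-12 can be pictured with a small Python round trip. This is a sketch under stated assumptions, not the syntax of the disclosure: the element order, the code points in OptType, the use of zeroth-order Exp-Golomb coding for every element (one of the binarizations listed in claim 13), and the relation "sub-sampled frame rate = original frame rate / scaling factor" are all assumptions made for the example. Every name (OptType, FilterInfo, BitWriter, BitReader, write_preprocessing_info, parse_preprocessing_info) is hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from fractions import Fraction
from typing import List, Optional


class OptType(IntEnum):
    # Hypothetical code points: the claims name these categories, not their values.
    SPATIAL_SUBSAMPLING = 0
    TEMPORAL_SUBSAMPLING = 1
    OBJECT_BASED = 2
    TEMPORAL_QUALITY = 3
    SPATIAL_QUALITY = 4


@dataclass
class FilterInfo:
    opt_type: OptType
    filter_type: int  # assumed to index a per-optimization-type list of filters
    original_frame_rate: Optional[int] = None  # temporal sub-sampling only
    scaling_factor: Optional[int] = None       # temporal sub-sampling only

    def subsampled_frame_rate(self) -> Optional[Fraction]:
        # Assumed relation: sub-sampled rate = original rate / scaling factor.
        if self.original_frame_rate is None or not self.scaling_factor:
            return None
        return Fraction(self.original_frame_rate, self.scaling_factor)


class BitWriter:
    def __init__(self) -> None:
        self.bits: List[int] = []

    def write_bits(self, value: int, n: int) -> None:
        for shift in range(n - 1, -1, -1):  # most significant bit first
            self.bits.append((value >> shift) & 1)

    def write_ue(self, value: int) -> None:
        # Zeroth-order Exp-Golomb: (n - 1) zero bits, then value + 1 in n bits.
        code = value + 1
        n = code.bit_length()
        self.write_bits(0, n - 1)
        self.write_bits(code, n)

    def to_bytes(self) -> bytes:
        padded = self.bits + [0] * (-len(self.bits) % 8)  # zero-pad to a whole byte
        out = bytearray()
        for i in range(0, len(padded), 8):
            byte = 0
            for bit in padded[i:i + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
        return bytes(out)


class BitReader:
    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current bit position

    def read_bit(self) -> int:
        bit = (self.data[self.pos >> 3] >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_ue(self) -> int:
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        suffix = 0
        for _ in range(leading_zeros):
            suffix = (suffix << 1) | self.read_bit()
        return (1 << leading_zeros) - 1 + suffix


def write_preprocessing_info(w: BitWriter, filters: List[FilterInfo]) -> None:
    w.write_ue(len(filters))  # number of filters applied
    for f in filters:
        w.write_ue(int(f.opt_type))
        w.write_ue(f.filter_type)
        if f.opt_type == OptType.TEMPORAL_SUBSAMPLING:
            # Frame-rate fields are present only for temporal sub-sampling.
            w.write_ue(f.original_frame_rate)
            w.write_ue(f.scaling_factor)


def parse_preprocessing_info(r: BitReader) -> List[FilterInfo]:
    filters = []
    for _ in range(r.read_ue()):
        opt_type = OptType(r.read_ue())
        info = FilterInfo(opt_type, r.read_ue())
        if opt_type == OptType.TEMPORAL_SUBSAMPLING:
            info.original_frame_rate = r.read_ue()
            info.scaling_factor = r.read_ue()
        filters.append(info)
    return filters
```

In this sketch, an encoder would call write_preprocessing_info to serialize the list of applied filters, and a decoder would call parse_preprocessing_info on the same bytes to recover it. Signaling two of the three frame-rate quantities, as in claim 5, is enough here because the third follows from the assumed relation.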

Description

Optimizing signaling for preprocessing

Cross Reference to Related Applications

The present application claims priority to and the benefit of U.S. provisional patent application No. 63/588,225, filed on October 5, 2023, which is incorporated herein by reference in its entirety.

Technical Field

The present disclosure relates to the generation, storage, and consumption of digital audio video media information in a file format.

Background

Digital video accounts for the largest share of bandwidth used on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, the bandwidth demand for digital video usage is likely to continue to grow.

Summary

A first aspect relates to a method for processing video data comprising determining that pre-processing information comprising properties or parameters of an optimized filter is included in a bitstream, and performing a conversion between visual media data and the bitstream based on the pre-processing information.
A second aspect relates to an apparatus for processing video data, comprising a processor, and a non-transitory memory having instructions thereon, wherein the instructions, when executed by the processor, cause the processor to perform any of the preceding aspects.
A third aspect relates to a non-transitory computer readable medium comprising a computer program product for use by a video codec device, the computer program product comprising computer executable instructions, wherein the computer executable instructions are stored on the non-transitory computer readable medium such that, when executed by a processor, they cause the video codec device to perform the method according to any one of the preceding aspects.
A fourth aspect relates to a non-transitory computer readable recording medium storing a bitstream of a video generated by a method performed by a video processing apparatus, wherein the method includes determining that preprocessing information including an attribute or parameter of an optimization filter is included in the bitstream, and generating the bitstream based on the determination.
A fifth aspect relates to a method for storing a bitstream of a video, comprising determining that pre-processing information including properties or parameters of an optimization filter is included in the bitstream, generating the bitstream based on the determination, and storing the bitstream in a non-transitory computer-readable recording medium.
A sixth aspect relates to a method, apparatus or system described in the present disclosure.
For clarity purposes, any one of the foregoing embodiments may be combined with any one or more of the other foregoing embodiments to create new embodiments within the scope of the present disclosure.
These and other features will be more fully understood from the following detailed description and claims, taken in conjunction with the accompanying drawings.

Drawings

For a more complete understanding of the present disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts.
Fig. 1 is a block diagram illustrating an example video processing system.
Fig. 2 is a block diagram of an example video processing apparatus.
Fig. 3 is a flow chart of an example method of video processing.
Fig. 4 is a block diagram illustrating an example video codec system.
Fig. 5 is a block diagram illustrating an example encoder.
Fig. 6 is a block diagram illustrating an example decoder.
Fig. 7 is a schematic diagram of an example encoder.

Detailed Description

It should be understood at the outset that although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or yet to be developed. The disclosure should in no way be limited to the illustrative implementations, drawings, and embodiments shown below, including the exemplary designs and implementations shown and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
The section headings are used in this disclosure for ease of understanding, and are not intended to limit the applicability of the techniques and embodiments disclosed in each section to that section only. Furthermore, H.266 terminology is used in some descriptions merely for ease of understanding and is not intended to limit the scope of the disclosed embodiments. Thus, the embodiments described herein are also applicable to other video codec protocols and designs. In the present disclosure, editing changes relative to the Versatile Video Coding (VVC) specification are shown in the text, with bold italics indicating deleted text and bold indicating added text.

1. Preliminary discussion

The present disclosure relate