CN-121986492-A - Method for controlling video coding and decoding tool based on neural network according to decoder capability
Abstract
The present embodiment discloses a method of controlling a neural network-based video codec tool according to the allowable computing capability of the decoder side. In the present embodiment, a video decoding apparatus decodes tool activation information indicating whether a neural network-based codec tool is activated, and determines whether the decoding side has an appropriate amount of computation available to perform operations related to the neural network-based codec tool. The video decoding apparatus sets a tool flag indicating whether to use the neural network-based codec tool based on the tool activation information and on whether the appropriate amount of computation is available. If the tool flag is true, the video decoding apparatus decodes parameters related to the neural network-based codec tool and applies the neural network-based codec tool to the current frame based on the decoded parameters.
Inventors
- JIANG ZHIYUAN
- LI DINGGUI
- Shen Zhoujuan
- XU ZHEN
- Cui Zhene
- PU SHENGYU
Assignees
- 现代自动车株式会社
- 起亚株式会社
- 梨花女子大学校产学协力团
Dates
- Publication Date
- 20260505
- Application Date
- 20240731
- Priority Date
- 20230825
Claims (18)
- 1. A method, performed by a video decoding device, of reconstructing a current frame, the method comprising: decoding, from a bitstream, tool activation information indicating whether a neural network-based codec tool is activated, the tool activation information indicating that the neural network-based codec tool is used by a video encoding apparatus during an encoding process of the current frame; determining whether the decoding side has an appropriate amount of computation required to perform operations related to the neural network-based codec tool; setting a tool flag indicating whether to use the neural network-based codec tool based on the tool activation information and on whether the appropriate amount of computation is available; and checking the tool flag, wherein the method further comprises, when the tool flag is true: decoding parameters related to the neural network-based codec tool from an adaptive parameter set within the bitstream; and applying the neural network-based codec tool to the current frame based on the decoded parameters.
- 2. The method of claim 1, wherein setting the tool flag comprises setting the tool flag to true when the tool activation information indicates that the neural network-based codec tool is activated and the appropriate amount of computation is available on the decoding side.
- 3. The method of claim 1, wherein setting the tool flag comprises setting the tool flag to false when the tool activation information indicates that the neural network-based codec tool is not activated or the appropriate amount of computation is not available on the decoding side.
- 4. The method of claim 1, wherein, when a plurality of neural network-based codec tools are used: decoding the tool activation information comprises decoding tool activation information that is a combination of flags each indicating whether a respective neural network-based codec tool is activated; determining whether there is an appropriate amount of computation comprises determining whether the decoding side has an amount of computation suitable for performing operations related to each neural network-based codec tool; and setting the tool flag comprises setting, based on the flag indicating whether each neural network-based codec tool is activated and on whether the appropriate amount of computation is available, a tool flag indicating whether to use each of the neural network-based codec tools.
- 5. The method of claim 1, wherein determining whether there is an appropriate amount of computation comprises: decoding a total number of parameters, a maximum memory usage, and an index representing a maximum computation amount; and determining, based on the total number of parameters, the maximum memory usage, and the index representing the maximum computation amount, whether the decoding side has an amount of computation suitable for performing operations related to the neural network-based codec tool.
- 6. The method of claim 1, wherein determining whether there is an appropriate amount of computation comprises: decoding a total number of parameters and an index representing a maximum computation amount from the bitstream; obtaining weights; calculating a total computation amount index by weighting the total number of parameters and the index representing the maximum computation amount based on the weights; and determining, based on the total computation amount index, whether the decoding side has an amount of computation suitable for performing operations related to the neural network-based codec tool.
- 7. The method of claim 1, wherein determining whether there is an appropriate amount of computation comprises determining, based on a TemporalID related to a level of a picture to which the current frame refers, whether the decoding side has an amount of computation suitable for performing operations related to the neural network-based codec tool.
- 8. The method of claim 1, wherein decoding the tool activation information comprises decoding the tool activation information from a sequence parameter set or a picture parameter set within the bitstream.
- 9. The method of claim 1, wherein decoding the parameters comprises decoding the parameters from a neural adaptive parameter set within the bitstream.
- 10. The method of claim 1, wherein, when the neural network-based codec tool is a neural network in-loop filter, decoding the parameters comprises decoding a set of parameters constituting the neural network in-loop filter, a structure of the neural network in-loop filter, information about a kernel used in the neural network in-loop filter, and an index indicating a structure of a predefined neural network in-loop filter.
- 11. The method of claim 10, wherein decoding the parameters comprises decoding information specifying an input type of the neural network in-loop filter when a preset filter set is applied as the neural network in-loop filter.
- 12. A method, performed by a video encoding device, of encoding a current frame, the method comprising: reconstructing the current frame using a decoding path; obtaining parameters related to a neural network-based codec tool; applying the neural network-based codec tool to the reconstructed current frame based on the parameters to generate an improved current frame; determining tool activation information indicating whether to activate the neural network-based codec tool based on the reconstructed current frame and the improved current frame; encoding the tool activation information and including it in a bitstream; and checking the tool activation information, wherein the method further comprises, when the tool activation information indicates that the neural network-based codec tool is activated, encoding the parameters and including them in an adaptive parameter set within the bitstream.
- 13. The method of claim 12, wherein determining tool activation information comprises determining tool activation information that is a combination of flags indicating whether each neural network-based codec tool is activated when a plurality of neural network-based codec tools are used.
- 14. The method of claim 12, further comprising, when the tool activation information indicates activation of the neural network-based codec tool: deriving a total number of parameters required to perform operations related to the neural network-based codec tool, a maximum memory usage, and an index representing a maximum computation amount; and encoding the total number of parameters, the maximum memory usage, and the index representing the maximum computation amount and including them in the bitstream.
- 15. The method of claim 12, further comprising, when the tool activation information indicates activation of the neural network-based codec tool: deriving a total number of parameters required to perform operations related to the neural network-based codec tool and an index representing a maximum computation amount; determining weights required for calculating a total computation amount index based on the total number of parameters and the index representing the maximum computation amount; and encoding the total number of parameters, the index representing the maximum computation amount, and the weights and including them in the bitstream.
- 16. The method of claim 12, wherein encoding the tool activation information comprises including the tool activation information in a sequence parameter set or a picture parameter set within the bitstream.
- 17. The method of claim 12, wherein encoding the parameters comprises including the parameters in a neural adaptive parameter set within the bitstream.
- 18. A method of providing video data to a video decoding device, the method comprising: encoding video data into a bitstream; and sending the bitstream to the video decoding device, wherein encoding the video data comprises: reconstructing a current frame using a decoding path; obtaining parameters related to a neural network-based codec tool; applying the neural network-based codec tool to the reconstructed current frame based on the parameters to generate an improved current frame; determining tool activation information indicating whether to activate the neural network-based codec tool based on the reconstructed current frame and the improved current frame; encoding the tool activation information and including it in the bitstream; and checking the tool activation information, wherein the method further comprises, when the tool activation information indicates that the neural network-based codec tool is activated, encoding the parameters and including them in an adaptive parameter set within the bitstream.
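As an illustration only, the encoder-side decision recited in claims 12 and 18 could be sketched as follows. The claims say only that the tool activation information is determined from the reconstructed current frame and the improved current frame. Comparing each of them against the original frame by mean squared error is an assumption made for this sketch, as are the names `mse`, `decide_tool_activation`, and `nn_tool`.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]  # flattened sample values; a simplification for illustration


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two equally sized, non-empty frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def decide_tool_activation(
    original: Frame,
    reconstructed: Frame,
    nn_tool: Callable[[Frame], Frame],
) -> Tuple[bool, List[float]]:
    """Return (tool activation flag, frame the encoder keeps).

    Assumption: the tool is activated only when the improved frame is
    closer to the original than the plain reconstruction is.
    """
    improved = list(nn_tool(reconstructed))
    activate = mse(original, improved) < mse(original, reconstructed)
    kept = improved if activate else list(reconstructed)
    return activate, kept
```

A real encoder would more likely use a rate-distortion cost that also accounts for the bits spent on the tool parameters; the distortion-only comparison here is kept deliberately simple.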
Description
Method for controlling video coding and decoding tool based on neural network according to decoder capability
Technical Field
The present invention relates to a method of controlling a neural network-based video codec tool according to the allowable computing capability of the decoder side.
Background
The following description merely provides background information related to the present embodiment and does not constitute prior art.
Since video data is large compared with audio data or still image data, storing or transmitting it without compression requires a large amount of hardware resources, including memory. Thus, when video data is stored or transmitted, an encoder compresses it and stores or transmits the compressed video data; a decoder receives the compressed video data, decompresses it, and plays it back. Such video compression techniques include H.264/AVC (Advanced Video Coding), HEVC (High Efficiency Video Coding), and VVC (Versatile Video Coding), which improves coding efficiency by about 30% or more over HEVC.
However, as image size, resolution, and frame rate gradually increase, the amount of data to be encoded also increases, so a new compression technique with higher coding efficiency and greater image quality improvement than existing techniques is required.
Recently, deep learning-based image processing techniques have been applied to existing coding element techniques. Coding efficiency can be improved by applying deep learning-based image processing to compression techniques such as inter prediction, intra prediction, in-loop filtering, and transform in existing coding techniques.
Representative application examples include inter prediction based on virtual reference frames generated from a deep learning model and in-loop filters based on an image reconstruction model. As described above, the neural network-based video codec tools used in video coding technology require a significant amount of computation on the decoder side. Accordingly, a method of using neural network-based video codec tools effectively needs to be considered in order to enhance video coding efficiency and improve video quality.
Disclosure of Invention
Technical Problem
An object of the present invention is to provide a video coding method and apparatus that scalably control a neural network-based video codec tool according to the allowable computing capability of the decoder side by transmitting and reconstructing, based on high-level syntax, flag and parameter information related to the neural network-based video codec tool.
Technical Solution
According to an embodiment of the present invention, a method of reconstructing a current frame performed by a video decoding apparatus may include: decoding, from a bitstream, tool activation information indicating whether a neural network-based codec tool is activated, the tool activation information indicating that the neural network-based codec tool is used by a video encoding apparatus during an encoding process of the current frame; determining whether the decoding side has an appropriate amount of computation required to perform operations related to the neural network-based codec tool; setting a tool flag indicating whether to use the neural network-based codec tool based on the tool activation information and on whether the appropriate amount of computation is available; and checking the tool flag.
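The decoder-side steps above can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the containers `ToolComplexity` and `DecoderBudget`, the per-field threshold comparison, and the linear weighted sum in `total_complexity_index` are all assumptions. The document states only that the total number of parameters, the maximum memory usage, and an index representing the maximum computation amount may be used (claim 5), optionally combined through weights (claim 6).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolComplexity:
    """Figures signalled for one neural network-based tool (hypothetical container)."""
    total_params: int       # total number of parameters
    max_memory_bytes: int   # maximum memory usage
    max_compute_index: int  # index representing the maximum computation amount


@dataclass(frozen=True)
class DecoderBudget:
    """What this decoder can afford (hypothetical; device specific)."""
    max_params: int
    max_memory_bytes: int
    max_compute_index: int


def has_enough_compute(c: ToolComplexity, b: DecoderBudget) -> bool:
    """Assumed check: every signalled figure must fit within the decoder budget."""
    return (
        c.total_params <= b.max_params
        and c.max_memory_bytes <= b.max_memory_bytes
        and c.max_compute_index <= b.max_compute_index
    )


def total_complexity_index(total_params: int, max_compute_index: int,
                           w_params: float, w_compute: float) -> float:
    """Assumed form of the weighted index: a plain linear combination."""
    return w_params * total_params + w_compute * max_compute_index


def set_tool_flag(tool_activated: bool, compute_available: bool) -> bool:
    """True only if the tool is activated AND enough computation is available."""
    return tool_activated and compute_available
```

With this shape, a decoder that cannot afford the tool simply leaves the tool flag false and skips the tool's parameters, while a more capable decoder reading the same bitstream applies it.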
The method may further include, when the tool flag is true, decoding parameters related to the neural network-based codec tool from an adaptive parameter set within the bitstream, and applying the neural network-based codec tool to the current frame based on the decoded parameters.
According to another embodiment of the present invention, a method of encoding a current frame performed by a video encoding apparatus may include: reconstructing the current frame using a decoding path; obtaining parameters related to a neural network-based codec tool; applying the neural network-based codec tool to the reconstructed current frame based on the parameters to generate an improved current frame; determining tool activation information indicating whether to activate the neural network-based codec tool based on the reconstructed current frame and the improved current frame; encoding the tool activation information and including it in a bitstream; and checking the tool activation information. The method may further include, when the tool activation information indicates activation of the neural network-based codec tool, encoding the parameters and including them in an adaptive parameter set within the bitstream.
According to another embodiment of the present invention, a method of providing video data to a video decoding apparatus may include encoding the video dat