
US-12627843-B2 - Method of performing neural network filtering for video data and device

US 12627843 B2

Abstract

A device may be configured to perform filtering based on information included in a neural network post-filter characteristics message. In one example, the neural network post-filter characteristics message includes a syntax element indicating a purpose of a post-processing filter and a syntax element specifying whether syntax elements related to a purpose, input formatting, output formatting, and complexity of the post-processing filter are present in the neural network post-filter characteristics message. The syntax element indicating a purpose may precede the syntax element specifying whether syntax elements are present.
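The ordering the abstract describes implies a simple sequential parse: the purpose element is decodable before the flag that gates the remaining property elements. The following is a minimal, non-normative sketch of that read order in C++. The bit reader, the struct, and its field names (`purpose`, `property_present`) are illustrative inventions, and coding the purpose element as ue(v) is an assumption; the abstract does not specify it.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Illustrative MSB-first bit reader over a raw SEI payload.
class BitReader {
 public:
  explicit BitReader(std::vector<uint8_t> data) : data_(std::move(data)) {}
  uint32_t ReadBit() {
    if (pos_ >= 8 * data_.size()) throw std::runtime_error("payload exhausted");
    uint32_t bit = (data_[pos_ / 8] >> (7 - pos_ % 8)) & 1u;
    ++pos_;
    return bit;
  }
  uint32_t ReadBits(int n) {  // u(n): n-bit unsigned integer, MSB first
    uint32_t v = 0;
    while (n-- > 0) v = (v << 1) | ReadBit();
    return v;
  }
  uint32_t ReadUE() {  // ue(v): unsigned 0-th order Exp-Golomb, left bit first
    int zeros = 0;
    while (ReadBit() == 0) ++zeros;
    return (1u << zeros) - 1u + ReadBits(zeros);
  }
 private:
  std::vector<uint8_t> data_;
  size_t pos_ = 0;
};

// Hypothetical struct: the abstract fixes the parse order, not the names.
struct NnpfcSketch {
  uint32_t purpose = 0;        // purpose of the post-processing filter
  bool property_present = false;  // whether purpose-, input-formatting-,
                                  // output-formatting-, and complexity-related
                                  // syntax elements follow in the message
};

NnpfcSketch ParseNnpfcSketch(BitReader& br) {
  NnpfcSketch m;
  m.purpose = br.ReadUE();                   // parsed first, per the abstract
  m.property_present = br.ReadBits(1) != 0;  // parsed after the purpose
  if (m.property_present) {
    // ... gated property syntax elements would be parsed here.
  }
  return m;
}
```

Everything behind the presence flag is deliberately elided; the point of the sketch is only that the purpose is available to the parser before any of the gated property elements.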

Inventors

  • Sachin G. Deshpande

Assignees

  • SHARP KABUSHIKI KAISHA

Dates

Publication Date
2026-05-12
Application Date
2024-09-19

Claims (5)

  1. A device comprising: one or more processors configured to: receive a neural network post-filter characteristics message; parse a first syntax element in the neural network post-filter characteristics message, wherein the first syntax element indicates a purpose of a post-processing filter, and the first syntax element is an initial syntax element present in the neural network post-filter characteristics message; parse a second syntax element in the neural network post-filter characteristics message, wherein the second syntax element is an unsigned integer 0-th order Exp-Golomb-coded syntax element with a left bit first, contains an identifying number that may be used to identify a post-processing filter, and the second syntax element immediately follows the first syntax element in the neural network post-filter characteristics message; and parse a third syntax element in the neural network post-filter characteristics message, wherein the third syntax element is a one-bit syntax element, and the third syntax element specifies that the neural network post-filter characteristics message specifies an update to a previous neural network post-filter having a same value of the second syntax element.
  2. The device of claim 1, wherein the device includes a video decoder.
  3. A device comprising: one or more processors configured to: signal a neural network post-filter characteristics message, wherein the neural network post-filter characteristics message includes: a first syntax element, wherein the first syntax element indicates a purpose of a post-processing filter, and the first syntax element is an initial syntax element present in the neural network post-filter characteristics message, a second syntax element, wherein the second syntax element is an unsigned integer 0-th order Exp-Golomb-coded syntax element with a left bit first, contains an identifying number that may be used to identify a post-processing filter, and the second syntax element immediately follows the first syntax element in the neural network post-filter characteristics message, and a third syntax element, wherein the third syntax element is a one-bit syntax element, and the third syntax element specifies that the neural network post-filter characteristics message specifies an update to a previous neural network post-filter having a same value of the second syntax element.
  4. The device of claim 3, wherein the device includes a video encoder.
  5. A non-transitory computer-readable storage medium comprising instructions stored thereon that, when executed, cause one or more processors of a device to: signal a bitstream, wherein the bitstream includes a neural network post-filter characteristics message, wherein the neural network post-filter characteristics message includes: a first syntax element, wherein the first syntax element indicates a purpose of a post-processing filter, and the first syntax element is an initial syntax element present in the neural network post-filter characteristics message, a second syntax element, wherein the second syntax element is an unsigned integer 0-th order Exp-Golomb-coded syntax element with a left bit first, contains an identifying number that may be used to identify a post-processing filter, and the second syntax element immediately follows the first syntax element in the neural network post-filter characteristics message, and a third syntax element, wherein the third syntax element is a one-bit syntax element, and the third syntax element specifies that the neural network post-filter characteristics message specifies an update to a previous neural network post-filter having a same value of the second syntax element.
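Claims 3 and 5 above recite the signaling side of the same three-element layout: the purpose element first, immediately followed by a ue(v)-coded identifying number, then a one-bit update flag. Below is a minimal sketch of an encoder emitting those elements in the claimed order. The bit writer and field names are illustrative, coding the purpose element as ue(v) is an assumption (the claims fix only the second element's coding), and a conforming message would carry many further syntax elements.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative MSB-first bit writer.
class BitWriter {
 public:
  void WriteBit(uint32_t bit) {
    if (nbits_ % 8 == 0) bytes_.push_back(0);
    bytes_.back() |= (bit & 1u) << (7 - nbits_ % 8);
    ++nbits_;
  }
  void WriteBits(uint32_t v, int n) {  // u(n), most significant bit first
    for (int i = n - 1; i >= 0; --i) WriteBit((v >> i) & 1u);
  }
  void WriteUE(uint32_t v) {  // ue(v): 0-th order Exp-Golomb, left bit first
    // Codeword is <len zeros> followed by (v + 1) in (len + 1) bits,
    // where len = floor(log2(v + 1)).
    uint32_t x = v + 1;
    int len = 0;
    for (uint32_t t = x; t > 1; t >>= 1) ++len;
    WriteBits(0, len);       // leading zeros
    WriteBits(x, len + 1);   // '1' marker followed by the info bits
  }
  const std::vector<uint8_t>& bytes() const { return bytes_; }
 private:
  std::vector<uint8_t> bytes_;
  size_t nbits_ = 0;
};

// Signal the first three NNPFC syntax elements in the claimed order.
// Parameter names are illustrative, not the standard's.
void SignalNnpfcHeader(BitWriter& bw, uint32_t purpose, uint32_t filter_id,
                       bool is_update_of_same_id) {
  bw.WriteUE(purpose);    // first syntax element: purpose of the filter
  bw.WriteUE(filter_id);  // second: ue(v) identifying number, immediately
                          // following the purpose element
  bw.WriteBits(is_update_of_same_id ? 1u : 0u, 1);  // third: one-bit flag
                          // marking an update to a prior filter with the
                          // same identifying number
}
```

Because ue(v) is a prefix code, the decoder sketch shown after the abstract can recover each element without any length field, which is why the claims can pin down the element order alone.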

Description

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 18/084,346, filed on Dec. 19, 2022, the content of which is hereby incorporated by reference into this application.

TECHNICAL FIELD

This disclosure relates to video coding and more particularly to techniques for signaling neural network post-filter parameter information for coded video.

BACKGROUND

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, laptop or desktop computers, tablet computers, digital recording devices, digital media players, video gaming devices, cellular telephones, including so-called smartphones, medical imaging devices, and the like. Digital video may be coded according to a video coding standard. Video coding standards define the format of a compliant bitstream encapsulating coded video data. A compliant bitstream is a data structure that may be received and decoded by a video decoding device to generate reconstructed video data. Video coding standards may incorporate video compression techniques. Examples of video coding standards include ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC) and High-Efficiency Video Coding (HEVC). HEVC is described in High Efficiency Video Coding (HEVC), Rec. ITU-T H.265, December 2016, which is incorporated by reference, and referred to herein as ITU-T H.265. Extensions and improvements for ITU-T H.265 are being considered for the development of next-generation video coding standards. For example, the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) (collectively referred to as the Joint Video Exploration Team (JVET)) have standardized video coding technology with a compression capability that significantly exceeds that of the current HEVC standard. The Joint Exploration Model 7 (JEM 7), Algorithm Description of Joint Exploration Test Model 7 (JEM 7), ISO/IEC JTC1/SC29/WG11 Document: JVET-G1001, July 2017, Torino, IT, which is incorporated by reference herein, describes the coding features that were under coordinated test model study by the JVET as potentially enhancing video coding technology beyond the capabilities of ITU-T H.265. It should be noted that the coding features of JEM 7 are implemented in JEM reference software. As used herein, the term JEM may collectively refer to algorithms included in JEM 7 and implementations of JEM reference software. Further, in response to a “Joint Call for Proposals on Video Compression with Capabilities beyond HEVC,” jointly issued by VCEG and MPEG, multiple descriptions of video coding tools were proposed by various groups at the 10th Meeting of ISO/IEC JTC1/SC29/WG11, 16-20 Apr. 2018, San Diego, CA. From the multiple descriptions of video coding tools, a resulting initial draft text of a video coding specification is described in “Versatile Video Coding (Draft 1),” 10th Meeting of ISO/IEC JTC1/SC29/WG11, 16-20 Apr. 2018, San Diego, CA, document JVET-J1001-v2, which is incorporated by reference herein, and referred to as JVET-J1001. This development of a video coding standard by VCEG and MPEG is referred to as the Versatile Video Coding (VVC) project. “Versatile Video Coding (Draft 10),” 20th Meeting of ISO/IEC JTC1/SC29/WG11, 7-16 Oct. 2020, Teleconference, document JVET-T2001-v2, which is incorporated by reference herein, and referred to as JVET-T2001, represents the current iteration of the draft text of a video coding specification corresponding to the VVC project.

Video compression techniques enable the data requirements for storing and transmitting video data to be reduced. Video compression techniques may reduce data requirements by exploiting the inherent redundancies in a video sequence. Video compression techniques may sub-divide a video sequence into successively smaller portions (i.e., groups of pictures within a video sequence, a picture within a group of pictures, regions within a picture, sub-regions within regions, etc.). Intra prediction coding techniques (e.g., spatial prediction techniques within a picture) and inter prediction techniques (i.e., inter-picture techniques (temporal)) may be used to generate difference values between a unit of video data to be coded and a reference unit of video data. The difference values may be referred to as residual data. Residual data may be coded as quantized transform coefficients. Syntax elements may relate residual data and a reference coding unit (e.g., intra-prediction mode indices and motion information). Residual data and syntax elements may be entropy coded. Entropy-encoded residual data and syntax elements may be included in data structures forming a compliant bitstream.

SUMMARY

In general, this disclosure describes various techniques for coding video data. In particular, this disclosure describes techniques for signaling neural network post-filter parameter information for coded video data. It should be noted that although