
KR-20260065821-A - Warping layer format for implicit neural representation

KR-20260065821-A

Abstract

A method comprising: obtaining a first part of a first signal; obtaining an implicit neural representation network learned for a second part of a second signal; learning parameters of a neural network implementing a warping layer allowing mapping of coordinates of the second signal onto coordinates of the first signal, the learning comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signaling, in a data set, at least a subset of the learned parameters and an index representing the implicit neural representation network.

Inventors

  • Hellier, Pierre
  • Damodaran, Bharath Bhushan
  • Lambert, Anne
  • Schnitzler, François

Assignees

  • InterDigital CE Patent Holdings, SAS

Dates

Publication Date
2026-05-11
Application Date
2024-09-02
Priority Date
2023-09-06

Claims (20)

  1. A method comprising: obtaining (600) a first part of a first signal; obtaining (602) an implicit neural representation network learned for a second part of a second signal; learning (603) parameters of a neural network implementing a warping layer allowing mapping of coordinates of the second signal onto coordinates of the first signal, the learning comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signaling (604), in a data set, at least a subset of the learned parameters and an index representing the implicit neural representation network.
  2. The method of claim 1, wherein the first and second signals are the same signal, and the second part is encoded using the implicit neural representation network.
  3. The method of claim 1 or 2, wherein the warping layer comprises a polynomial embedding derived from a concatenation of a plurality of polynomial functions and a transformation layer implemented in the form of a multi-layer perceptron, the polynomial embedding and the transformation layer being defined by the learned parameters.
  4. The method of claim 3, wherein the parameters are quantized and entropy-encoded, the quantization and entropy encoding of the parameters being taken into account in the minimization.
  5. The method of claim 3 or 4, wherein, in response to a subset of the plurality of polynomial functions being used, a syntax element indicating the use of the subset of the plurality is signaled in the data set, and indices of the polynomial functions of the subset of the plurality are signaled in the data set.
  6. A method comprising: obtaining a first signal; applying a joint learning phase wherein parameters of an implicit neural representation network and parameters of a warping layer allowing mapping of coordinates of a second signal onto coordinates of the first signal are learned jointly, the joint learning phase comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signaling, in a data set, the learned parameters of the implicit neural representation network and at least a subset of the learned parameters.
  7. A method comprising: obtaining (700), from a data set, parameters of a neural network implementing a warping layer allowing mapping of coordinates of a second part of a second signal onto coordinates of a first part of a first signal; decoding (701), from the data set, an index representing an implicit neural representation network; applying (702) the warping layer, using the obtained parameters, to coordinates of samples of the first part of the first signal to obtain warped coordinates; and applying (703) the implicit neural representation network to the warped coordinates.
  8. The method of claim 7, wherein the first and second signals are the same signal, and the second part is decoded using the implicit neural representation network.
  9. The method of claim 7 or 8, wherein the warping layer comprises a polynomial embedding derived from a concatenation of a plurality of polynomial functions and a transformation layer implemented in the form of a multi-layer perceptron, the polynomial embedding and the transformation layer being defined by the parameters.
  10. The method of claim 9, wherein, in response to a subset of the plurality of polynomial functions being used, a syntax element indicating the use of the subset of the plurality is signaled in the data set, and indices of the polynomial functions of the subset of the plurality are signaled in the data set.
  11. A device comprising electronic circuitry configured to: obtain a first part of a first signal; obtain an implicit neural representation network learned for a second part of a second signal; learn parameters of a neural network implementing a warping layer allowing mapping of coordinates of the second signal onto coordinates of the first signal, the learning comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signal, in a data set, at least a subset of the learned parameters and an index representing the implicit neural representation network.
  12. The device of claim 11, wherein the first and second signals are the same signal, and the second part is encoded using the implicit neural representation network.
  13. The device of claim 11 or 12, wherein the warping layer comprises a polynomial embedding derived from a concatenation of a plurality of polynomial functions and a transformation layer implemented in the form of a multi-layer perceptron, the polynomial embedding and the transformation layer being defined by the learned parameters.
  14. The device of claim 13, wherein the parameters are quantized and entropy-encoded, the quantization and entropy encoding of the parameters being taken into account in the minimization.
  15. The device of claim 13 or 14, wherein, in response to a subset of the plurality of polynomial functions being used, a syntax element indicating the use of the subset of the plurality is signaled in the data set, and indices of the polynomial functions of the subset of the plurality are signaled in the data set.
  16. A device comprising electronic circuitry configured to: obtain a first signal; apply a joint learning phase wherein parameters of an implicit neural representation network and parameters of a warping layer allowing mapping of coordinates of a second signal onto coordinates of the first signal are learned jointly, the joint learning phase comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signal, in a data set, the learned parameters of the implicit neural representation network and at least a subset of the learned parameters.
  17. A device comprising electronic circuitry configured to: obtain, from a data set, parameters of a neural network implementing a warping layer allowing mapping of coordinates of a second part of a second signal onto coordinates of a first part of a first signal; decode, from the data set, an index representing an implicit neural representation network; apply the warping layer, using the obtained parameters, to coordinates of samples of the first part of the first signal to obtain warped coordinates; and apply the implicit neural representation network to the warped coordinates.
  18. The device of claim 17, wherein the first and second signals are the same signal, and the second part is decoded using the implicit neural representation network.
  19. The device of claim 17 or 18, wherein the warping layer comprises a polynomial embedding derived from a concatenation of a plurality of polynomial functions and a transformation layer implemented in the form of a multi-layer perceptron, the polynomial embedding and the transformation layer being defined by the parameters.
  20. The device of claim 19, wherein, in response to a subset of the plurality of polynomial functions being used, a syntax element indicating the use of the subset of the plurality is signaled in the data set, and indices of the polynomial functions of the subset of the plurality are signaled in the data set.
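The decoding flow of claims 7 and 17 (obtain warping-layer parameters, decode an INR index, warp the sample coordinates, apply the INR) can be sketched as follows. This is an illustrative sketch only: the data-set layout, the `apply_warp` placeholder (a simple affine map standing in for the learned warping network), and the trivial stand-in INR are assumptions, not structures defined by the patent.

```python
import numpy as np

def apply_warp(coords, params):
    """Placeholder warping layer: an affine map standing in for the learned network."""
    A, t = params
    return coords @ A + t

def decode_part(data_set, coords, inr_table):
    warp_params = data_set["warp_params"]     # step 700: warping-layer parameters
    inr_index = data_set["inr_index"]         # step 701: index of the INR network
    inr = inr_table[inr_index]
    warped = apply_warp(coords, warp_params)  # step 702: warp the sample coordinates
    return inr(warped)                        # step 703: apply the INR network

# Toy usage with a trivial stand-in for a trained INR network.
coords = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
data_set = {"warp_params": (np.eye(2), np.array([0.1, -0.1])), "inr_index": 0}
inr_table = {0: lambda c: c.sum(axis=-1)}
out = decode_part(data_set, coords, inr_table)
print(out.shape)  # (3,)
```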

Description

At least one of the embodiments relates generally to a method and device for encoding and decoding picture or video data based on implicit neural representation.

Implicit Neural Representation (INR)-based compression techniques are relatively new compression technologies that can be applied to 2D pictures, videos, and 3D scenes or objects. These techniques have much lower computational complexity than end-to-end neural-network-based compression approaches. An INR network is a neural network composed of multiple neural layers, typically fully connected layers. Each neural layer can be described as a function that first multiplies an input signal by a tensor, adds a vector referred to as a bias, and then applies a non-linear function to the resulting values. The shape (and other characteristics) of the tensors and the types of the non-linear functions are referred to as the network architecture. Input signals can be modified by a transformation, such as a Fourier mapping, a coordinate transformation, or a normalization, before being used as input to the neural network. An INR network is generally used to reconstruct an input signal: the network's characteristics are learned from the input signal and provided to a decoding unit responsible for its reconstruction. Encoding INR network characteristics entails a significant cost in terms of bitrate.

It is well known that many signals contain redundancy. For example, picture or video data contains parts that are at least partially correlated across various sections of the signal. Exploiting these correlations forms the basis of picture and video compression technology. It is therefore highly likely that an INR network trained on a first signal can be used, with slight adaptations if necessary, to reconstruct a second signal correlated with the first signal.
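The INR structure described above (an optional input transformation such as a Fourier mapping, followed by fully connected layers that each multiply by a tensor, add a bias, and apply a non-linearity) can be sketched numerically as follows. All names, layer sizes, and the choice of ReLU are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def fourier_mapping(coords, B):
    """Input transformation: map raw coordinates through random Fourier features."""
    proj = 2.0 * np.pi * coords @ B
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)

def inr_forward(coords, weights, biases, B):
    """Each layer multiplies by a tensor, adds a bias, then applies a non-linearity."""
    h = fourier_mapping(coords, B)
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)   # ReLU non-linearity (illustrative choice)
    return h @ weights[-1] + biases[-1]  # linear output layer, e.g. RGB values

# Toy instantiation: 2-D pixel coordinates -> 3 output channels.
rng = np.random.default_rng(0)
B = rng.normal(size=(2, 16))                          # Fourier mapping matrix
weights = [rng.normal(size=(32, 64)) * 0.1, rng.normal(size=(64, 3)) * 0.1]
biases = [np.zeros(64), np.zeros(3)]
grid = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 4))
coords = np.stack(grid, axis=-1).reshape(-1, 2)       # 16 sample coordinates
out = inr_forward(coords, weights, biases, B)
print(out.shape)  # (16, 3)
```

In an actual codec, `weights`, `biases`, and `B` would be learned from the input signal and transmitted to the decoder, which is exactly the bitrate cost the warping layer aims to amortize.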
However, few solutions have been proposed to reuse an INR network trained on a first signal for a second signal correlated with the first signal, which would allow a reduction in the signaling cost of INR network characteristics. It is desirable to propose solutions that overcome the aforementioned issues, in particular solutions that reduce the signaling cost of INR network characteristics by reusing an INR network trained on a first signal for second signals correlated with the first signal.

In a first aspect, one or more of the embodiments provide a method for encoding comprising: obtaining a first part of a first signal; obtaining an implicit neural representation network learned for a second part of a second signal; learning parameters of a neural network implementing a warping layer allowing mapping of coordinates of the second signal onto coordinates of the first signal, the learning comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signaling, in a data set, at least a subset of the learned parameters and an index representing the implicit neural representation network.

In one embodiment, the first and second signals are the same signal, and the second part is encoded using the implicit neural representation network. In one embodiment, the warping layer comprises a polynomial embedding derived from a concatenation of a plurality of polynomial functions and a transformation layer implemented in the form of a multi-layer perceptron, the polynomial embedding and the transformation layer being defined by the learned parameters. In one embodiment, the parameters are quantized and entropy-encoded, the quantization and entropy encoding of the parameters being taken into account in the minimization.
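The warping layer of the first aspect (a polynomial embedding formed by concatenating polynomial functions of the coordinates, followed by a small multi-layer perceptron, trained to minimize a reconstruction loss through a pre-trained INR) can be illustrated with the following minimal sketch. The polynomial degree, layer sizes, `tanh` activation, and the stand-in INR are all assumptions chosen for illustration.

```python
import numpy as np

def polynomial_embedding(coords, degree=3):
    # Concatenation of a plurality of polynomial functions of the coordinates:
    # here simply x, x^2, ..., x^degree applied per coordinate axis.
    return np.concatenate([coords ** d for d in range(1, degree + 1)], axis=-1)

def warp(coords, params, degree=3):
    # Transformation layer implemented as a small multi-layer perceptron.
    W1, b1, W2, b2 = params
    h = np.tanh(polynomial_embedding(coords, degree) @ W1 + b1)
    return h @ W2 + b2                       # warped 2-D coordinates

def warping_loss(coords, target, inr, params):
    # Loss representing the difference between the first signal and the INR
    # applied to the warped coordinates (here an L2 loss); only `params`
    # would be optimized, the INR itself stays frozen.
    return float(np.mean((inr(warp(coords, params)) - target) ** 2))

# Toy usage with a trivial stand-in for a pre-trained INR network.
rng = np.random.default_rng(1)
params = (rng.normal(size=(6, 8)) * 0.1, np.zeros(8),
          rng.normal(size=(8, 2)) * 0.1, np.zeros(2))
coords = rng.uniform(size=(32, 2))
inr = lambda c: c.sum(axis=-1)               # placeholder learned INR
target = coords.sum(axis=-1)                 # placeholder "first signal" samples
loss = warping_loss(coords, target, inr, params)
```

Because only the warping-layer parameters (here four small tensors) need to be signaled alongside an index of the reused INR, the bitrate cost can be far lower than retransmitting a full INR network.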
In one embodiment, in response to a subset of the plurality of polynomial functions being used, a syntax element indicating the use of the subset of the plurality is signaled in the data set, and indices of the polynomial functions of the subset of the plurality are signaled in the data set.

In a second aspect, one or more of the embodiments provide a method for encoding comprising: obtaining a first signal; applying a joint learning phase wherein parameters of an implicit neural representation network and parameters of a warping layer allowing mapping of coordinates of a second signal onto coordinates of the first signal are learned jointly, the joint learning phase comprising minimizing a loss function representing a difference between the first signal and a result of applying the implicit neural representation network to warped coordinates of the first signal; and signaling, in a data set, the learned parameters of the implicit neural representation network and at least a subset of the learned parameters.

In a third aspect, one or more of the embodiments provide a