KR-20260065561-A - METHOD OF ENCODING/DECODING GAUSSIAN SPLAT FOR 3D SPACE REPRESENTATION
Abstract
A method for encoding a Gaussian splat for a three-dimensional spatial representation according to the present disclosure may include: generating model parameters of Gaussian splats; generating at least one video frame based on the model parameters; and encoding the at least one video frame. In this case, type information indicating the type of the model parameters included in the video frame may be encoded as metadata.
Inventors
- 이광순
- 오관정
- 정준영
Assignees
- 한국전자통신연구원 (Electronics and Telecommunications Research Institute, ETRI)
Dates
- Publication Date: 2026-05-08
- Application Date: 2025-10-30
- Priority Date: 2024-11-01
Claims (16)
- A method for encoding a Gaussian splat for a three-dimensional spatial representation, the method comprising: generating model parameters of Gaussian splats; generating at least one video frame based on the model parameters; and encoding the at least one video frame, wherein type information indicating the type of the model parameters included in a video frame is encoded as metadata.
- The method of claim 1, wherein each of the model parameters is packed into a different video frame, and the type information is encoded per video frame.
- The method of claim 2, wherein a first video frame includes a first model parameter, and each of the sub-parameters of the first model parameter is packed into a different tile within the first video frame.
- The method of claim 3, wherein information for identifying the sub-parameters included in the first video frame is encoded.
- The method of claim 2, wherein a first video frame includes a first model parameter, and each of the sub-parameters of the first model parameter is packed into a different channel of the first video frame.
- The method of claim 2, wherein high-bit data of a first model parameter is packed into a first video frame, and low-bit data of the first model parameter is packed into a second video frame.
- The method of claim 6, wherein the first model parameter is position information of the Gaussian splats.
- The method of claim 1, wherein pixel values within a video frame are obtained by non-linearly transforming the model parameters.
- The method of claim 8, wherein the non-linear transformation is performed based on piece-wise linear scaling or a higher-order non-linear function.
- The method of claim 9, wherein information regarding the piece-wise linear scaling or the higher-order non-linear function is encoded as metadata, and the information includes at least one of a number of segment intervals or coefficients of the non-linear function.
- The method of claim 1, wherein a first model parameter is classified into multiple levels, and the classified levels are packed into different regions within a video frame or into different video frames.
- The method of claim 1, wherein data of a first model parameter is classified into multiple clusters, a codebook is generated by assigning a code value to each of the multiple clusters, a first video frame is generated based on a label value of each of the attribute values constituting the first model parameter, and the label value indicates the code value corresponding to the attribute value.
- The method of claim 12, wherein the first model parameter is a high-order spherical harmonic function.
- The method of claim 12, wherein the codebook is encoded as metadata, separately from the first video frame.
- A method for decoding a Gaussian splat for a three-dimensional spatial representation, the method comprising: decoding at least one video frame; reconstructing model parameters of Gaussian splats from the at least one decoded video frame; and rendering a target viewpoint image using the model parameters of the Gaussian splats, wherein type information indicating the type of the model parameters included in a video frame is decoded as metadata.
- A computer-readable recording medium storing instructions for performing a method of encoding a Gaussian splat for a three-dimensional spatial representation, the method comprising: generating model parameters of Gaussian splats; generating at least one video frame based on the model parameters; and encoding the at least one video frame, wherein type information indicating the type of the model parameters included in a video frame is encoded as metadata.
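As a rough illustration of claims 6 and 7 (not the patent's normative procedure), the sketch below quantizes Gaussian-splat positions to 16 bits and splits each value into a high-byte plane and a low-byte plane, so that each plane can be carried in an ordinary 8-bit video frame. The function names, the 16-bit depth, and the frame width are illustrative assumptions.

```python
import numpy as np

def pack_positions_high_low(positions, width=1024):
    """Quantize (N, 3) float positions to 16 bits, then split each value
    into a high-byte frame and a low-byte frame (claim 6). Layout choices
    (raster width, per-axis normalization) are illustrative."""
    lo, hi = positions.min(axis=0), positions.max(axis=0)
    # Normalize to [0, 1] per axis, then quantize to 16-bit integers.
    norm = (positions - lo) / np.maximum(hi - lo, 1e-12)
    q16 = np.round(norm * 65535).astype(np.uint16)

    # Pad to a full raster and reshape into (rows, width, 3) frames.
    n = q16.shape[0]
    rows = -(-n // width)  # ceiling division
    padded = np.zeros((rows * width, 3), dtype=np.uint16)
    padded[:n] = q16
    raster = padded.reshape(rows, width, 3)

    high_frame = (raster >> 8).astype(np.uint8)   # high-bit data, frame 1
    low_frame = (raster & 0xFF).astype(np.uint8)  # low-bit data, frame 2
    return high_frame, low_frame, (lo, hi)

def unpack_positions(high_frame, low_frame, bounds, count):
    """Decoder side: recombine the two byte planes and de-normalize."""
    lo, hi = bounds
    q16 = (high_frame.astype(np.uint16) << 8) | low_frame.astype(np.uint16)
    norm = q16.reshape(-1, 3)[:count].astype(np.float64) / 65535.0
    return norm * (hi - lo) + lo
```

The round-trip error is bounded by one quantization step per axis; the normalization bounds would travel as side information, analogous to the metadata of claim 1.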
Description
Method of Encoding/Decoding Gaussian Splat for 3D Space Representation

The present disclosure relates to a method for encoding/decoding a Gaussian splat for a three-dimensional spatial representation and an apparatus for performing the same. 3D Gaussian Splatting (3DGS) is a technology that models the radiance field of a 3D space as a set of 3D Gaussians and renders the 3D space into a 2D image. A single Gaussian splat consists of multiple pieces of attribute information, so the amount of data for a scene is vast; accordingly, various studies are underway to compress this data effectively.

- Figure 1 illustrates a radiance field-based 3D spatial representation and video rendering method.
- Figure 2 is a block diagram illustrating the structure of an encoder and a decoder for providing a radiance field-based 3D service according to one embodiment of the present disclosure.
- Figure 3 shows an example of model parameters of Gaussian splats converted into 2D images.
- Figure 4 shows an example in which non-linear normalization based on piece-wise (interval-by-interval) linear scaling is performed.
- Figure 5 illustrates an example of a video frame structure generated by transforming model parameters.
- Figure 6 shows an example in which model parameters or sub-parameters of Gaussian splats are packed into different channels.
- Figure 7 shows an example in which model parameters of different levels are packed into different tiles.
- Figure 8 is a flowchart of a method for packing high-order spherical harmonic functions into a video frame.
- Figure 9 shows an example of encoding the spherical harmonic coefficients of a Gaussian splat based on a codebook and labels.

The present disclosure is subject to various modifications and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description.
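The piece-wise linear scaling of Figure 4 can be sketched as follows. The idea is to spend more of the pixel-value range on intervals where parameter values are dense; the specific knot positions and the 8-bit output range below are illustrative assumptions, not values prescribed by the disclosure (per claim 10, the interval count and coefficients would be signaled as metadata).

```python
import numpy as np

def piecewise_linear_forward(x, in_knots, out_knots):
    """Map parameter values to pixel values with piece-wise linear scaling.
    `in_knots` are interval boundaries in the parameter domain, `out_knots`
    the corresponding boundaries in the pixel domain (both ascending)."""
    return np.interp(x, in_knots, out_knots)

def piecewise_linear_inverse(y, in_knots, out_knots):
    """Decoder side: invert the mapping (valid when `out_knots` is
    strictly increasing, so the forward map is bijective)."""
    return np.interp(y, out_knots, in_knots)

# Illustrative knots: the central interval [-1, 1] of the parameter range
# [-10, 10] is allotted half of the 8-bit pixel range.
in_knots = np.array([-10.0, -1.0, 1.0, 10.0])
out_knots = np.array([0.0, 64.0, 192.0, 255.0])
```

With these knots, a value near zero is represented with roughly nine times finer granularity than a value near the range limits, which is the benefit the non-linear normalization of claims 8-10 is after.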
However, this is not intended to limit the present disclosure to specific embodiments; it should be understood that the disclosure includes all modifications, equivalents, and substitutions that fall within its spirit and scope. Like reference numerals in the drawings refer to the same or similar functions across various aspects. The shapes and sizes of elements in the drawings may be exaggerated for clarity. The detailed description of exemplary embodiments below refers to the accompanying drawings, which illustrate specific embodiments by way of example. These embodiments are described in sufficient detail to enable those skilled in the art to practice them. It should be understood that the various embodiments, although different, need not be mutually exclusive. For example, specific shapes, structures, and characteristics described herein in relation to one embodiment may be implemented in another embodiment without departing from the spirit and scope of the present disclosure. It should also be understood that the position or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of that embodiment. Accordingly, the following detailed description is not to be taken in a limiting sense, and the scope of the exemplary embodiments is defined only by the appended claims, together with the full scope of equivalents to which those claims are entitled. In this disclosure, terms such as first and second may be used to describe various components, but the components should not be limited by these terms; such terms are used solely to distinguish one component from another. For example, without departing from the scope of this disclosure, a first component may be named a second component, and similarly, a second component may be named a first component.
The term "and/or" includes any combination of a plurality of related listed items, or any one of a plurality of related listed items. Where any component of the present disclosure is described as being "connected" or "coupled" to another component, it should be understood that it may be directly connected or coupled to that other component, or that intervening components may be present. In contrast, where a component is described as being "directly connected" or "directly coupled" to another component, it should be understood that no intervening components are present. The components shown in the embodiments of the present disclosure are depicted independently to represent their distinct characteristic functions; this does not mean that each component is implemented as separate hardware or as a single software unit. That is, each component is listed as a separate component for convenience of explanation; at least two of the components may be combined into a single component, or a single component may be divided into multiple components that perform the function, and such integrated and separated embodiments of each component are included within the scope of the present disclosure.
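The codebook-and-label scheme of Figure 9 and claims 12-14 can be sketched with a small k-means quantizer over attribute vectors such as high-order spherical harmonic coefficients. The use of k-means specifically, and the cluster count, are assumptions for illustration; the disclosure only requires clustering plus an assignment of code values.

```python
import numpy as np

def build_codebook(vectors, k=16, iters=20, seed=0):
    """Cluster (N, D) attribute vectors into k clusters and return
    (codebook, labels). Plain k-means, used here only as an example of
    the clustering step of claim 12."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), size=k, replace=False)]
    for _ in range(iters):
        # Assign each vector to its nearest code (these are the labels
        # that would be packed into the first video frame).
        d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update each code to the mean of its cluster.
        for c in range(k):
            members = vectors[labels == c]
            if len(members):
                codebook[c] = members.mean(axis=0)
    return codebook, labels

def decode(codebook, labels):
    """Decoder side: look up each label in the codebook. Per claim 14,
    the codebook itself would travel separately, as metadata."""
    return codebook[labels]
```

The labels are small integers suitable for packing into a video frame, while the codebook is compact enough to carry as metadata; reconstruction is a single table lookup.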