US-20260129232-A1 - METHOD OF ENCODING/DECODING GAUSSIAN SPLAT FOR 3D SPACE REPRESENTATION

Abstract

A method of encoding a Gaussian splat for 3D space representation according to the present disclosure comprises: generating model parameters of Gaussian splats; generating at least one video frame based on the model parameters; and encoding the at least one video frame, wherein type information indicating a type of a model parameter included in a video frame is encoded as metadata.
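
As a rough illustration of the flow summarized above, the following minimal sketch packs one model parameter into a frame-shaped buffer and attaches its type as metadata. The type codes in `ParamType`, the `PackedFrame` container, and the row-major zero-padded layout are all assumptions made for illustration; the disclosure does not define them here, and the actual video encoding step is omitted.

```python
from dataclasses import dataclass
from enum import Enum


class ParamType(Enum):
    # Hypothetical type codes; the disclosure does not list concrete values.
    POSITION = 0
    SCALE = 1
    ROTATION = 2
    OPACITY = 3
    SPHERICAL_HARMONICS = 4


@dataclass
class PackedFrame:
    param_type: ParamType  # type information carried as metadata
    width: int
    height: int
    count: int             # number of valid samples before padding
    pixels: list           # row-major sample values, zero-padded


def pack_parameter(param_type, values, width):
    """Lay out one model parameter as a width x height frame (assumed layout)."""
    height = -(-len(values) // width)  # ceiling division
    padding = [0] * (width * height - len(values))
    return PackedFrame(param_type, width, height, len(values), list(values) + padding)


def unpack_parameter(frame):
    """Decoder side: the metadata tells which parameter the frame holds."""
    return frame.param_type, frame.pixels[:frame.count]
```

A decoder written along these lines would read `param_type` first and then route the recovered samples to the matching attribute of the Gaussian splats.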

Inventors

  • Gwangsoon Lee
  • Kwan Jung Oh
  • Jun Young JEONG

Assignees

  • ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE

Dates

Publication Date: 2026-05-07
Application Date: 2025-10-30
Priority Date: 2024-11-01

Claims (16)

  1. A method of encoding a Gaussian splat for 3D space representation, the method comprising: generating model parameters of Gaussian splats; generating at least one video frame based on the model parameters; and encoding the at least one video frame, wherein type information indicating a type of a model parameter included in a video frame is encoded as metadata.
  2. The method of claim 1, wherein each model parameter is packed into a separate video frame, and wherein the type information is encoded for each video frame.
  3. The method of claim 2, wherein a first video frame comprises a first model parameter, and wherein each sub-parameter of the first model parameter is packed into a separate tile in the first video frame.
  4. The method of claim 1, wherein information for identifying the sub-parameters comprised in the first video frame is encoded.
  5. The method of claim 2, wherein a first video frame comprises a first model parameter, and wherein each sub-parameter of the first model parameter is packed into a separate channel of the first video frame.
  6. The method of claim 2, wherein high-bit data of a first model parameter is packed into a first video frame and low-bit data of the first model parameter is packed into a second video frame.
  7. The method of claim 6, wherein the first model parameter represents position information of the Gaussian splats.
  8. The method of claim 1, wherein a pixel value in the video frame is obtained by performing a non-linear transform on the model parameter.
  9. The method of claim 8, wherein the non-linear transform is performed based on piece-wise linear scaling or a multiple order non-linear function.
  10. The method of claim 9, wherein information on the piece-wise linear scaling or the multiple order non-linear function is encoded as metadata, and wherein the information comprises at least one of a number of piece intervals or coefficients of the non-linear function.
  11. The method of claim 1, wherein a first model parameter is categorized into a plurality of levels, and wherein each of the plurality of levels is packed into a separate region in the video frame or a separate video frame.
  12. The method of claim 1, wherein a codebook is generated by assigning a code value to each of groups, the groups being obtained by categorizing data of a first model parameter, wherein a first video frame is generated based on a label of attribute values constituting the first model parameter, and wherein the label indicates a code value corresponding to an attribute value.
  13. The method of claim 12, wherein the first model parameter represents a high-order spherical harmonic function.
  14. The method of claim 12, wherein the codebook is encoded separately from the first video frame, as metadata.
  15. A method of decoding a Gaussian splat for 3D space representation, the method comprising: decoding at least one video frame; reconstructing model parameters of Gaussian splats from at least one decoded video frame; and rendering an image of a target viewpoint based on the model parameters of the Gaussian splats, wherein type information indicating a type of a model parameter included in a video frame is decoded as metadata.
  16. A non-transitory computer-readable medium storing instructions that, when executed, cause a computer to: generate model parameters of Gaussian splats; generate at least one video frame based on the model parameters; and encode the at least one video frame, wherein type information indicating a type of a model parameter included in a video frame is encoded as metadata.
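
As an illustration of the piece-wise linear scaling named in claims 9 and 10, the sketch below maps a parameter value to a pixel value through a curve defined by breakpoints, and inverts the mapping on the decoder side. The breakpoint lists `knots_in` and `knots_out`, the 10-bit output range, and the `metadata` layout are assumptions chosen for the example; the claims only state that information such as the number of piece intervals is encoded as metadata.

```python
from bisect import bisect_right


def pwl_forward(value, knots_in, knots_out):
    """Map a parameter value to a pixel value along a piece-wise linear curve."""
    # Clamp so that out-of-range values saturate at the end points.
    value = min(max(value, knots_in[0]), knots_in[-1])
    # Index of the interval [knots_in[i], knots_in[i + 1]] holding the value.
    i = min(bisect_right(knots_in, value) - 1, len(knots_in) - 2)
    t = (value - knots_in[i]) / (knots_in[i + 1] - knots_in[i])
    return round(knots_out[i] + t * (knots_out[i + 1] - knots_out[i]))


def pwl_inverse(pixel, knots_in, knots_out):
    """Decoder side: map a pixel value back to an approximate parameter value."""
    pixel = min(max(pixel, knots_out[0]), knots_out[-1])
    i = min(bisect_right(knots_out, pixel) - 1, len(knots_out) - 2)
    t = (pixel - knots_out[i]) / (knots_out[i + 1] - knots_out[i])
    return knots_in[i] + t * (knots_in[i + 1] - knots_in[i])


# Hypothetical metadata: three intervals, with half of the 10-bit code range
# spent on the smallest parameter values (0.0 to 0.1).
metadata = {
    "num_intervals": 3,
    "knots_in": [0.0, 0.1, 0.5, 1.0],
    "knots_out": [0, 512, 896, 1023],
}
```

With these assumed breakpoints, `pwl_forward(0.05, metadata["knots_in"], metadata["knots_out"])` lands halfway through the first interval, so small values are quantized more finely than large ones. Both breakpoint lists must be strictly increasing for the inverse to be well defined.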

Description

BACKGROUND OF THE INVENTION

Field of the Invention

The present disclosure relates to a method of encoding/decoding a Gaussian splat for 3D space representation.

Description of the Related Art

3D Gaussian splatting is a technology that models the radiance field of a 3D space as a collection of 3D Gaussians and renders the 3D space into a 2D image. Each Gaussian splat (GS) consists of multiple pieces of attribute information, so the amount of data per GS is large. Consequently, various studies are currently underway to effectively compress the data representing the GS.

SUMMARY OF THE INVENTION

It is an object of the present disclosure to provide a method for encoding/decoding model parameters of a Gaussian splat into 2D images for 3D space representation.

It is a further object of the present disclosure to provide a method for packing model parameters of a Gaussian splat into a plurality of regions and encoding/decoding metadata for the same.

It is a further object of the present disclosure to provide a method for classifying model parameters of a Gaussian splat into a plurality of levels, packing data for each level into a plurality of regions, and encoding/decoding metadata for the same.

It is a further object of the present disclosure to provide a method for encoding/decoding model parameters of a Gaussian splat based on a codebook and a label.

The technical problems to be solved by the present disclosure are not limited to those mentioned above, and other technical problems not mentioned herein may be clearly understood by those skilled in the art from the description below.
In accordance with an aspect of the present disclosure, the above and other objects can be accomplished by the provision of a method of encoding a Gaussian splat for 3D space representation, the method comprising generating model parameters of Gaussian splats; generating at least one video frame based on the model parameters; and encoding the at least one video frame, wherein type information indicating a type of a model parameter included in a video frame is encoded as metadata.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, and the type information is encoded for each video frame.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, a first video frame comprises a first model parameter, and each sub-parameter of the first model parameter is packed into a separate tile in the first video frame.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, and information for identifying the sub-parameters comprised in the first video frame is encoded.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, a first video frame comprises a first model parameter, and each sub-parameter of the first model parameter is packed into a separate channel of the first video frame.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, and high-bit data of a first model parameter is packed into a first video frame while low-bit data of the first model parameter is packed into a second video frame.
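
The high-bit/low-bit packing described above can be sketched as follows. The 16-bit quantization of the parameter and the even 8/8 split are assumptions for illustration only; the text does not state bit depths. The helper names `quantize`, `split_high_low`, and `merge_high_low` are likewise hypothetical.

```python
def quantize(x, lo, hi, bits=16):
    """Uniformly quantize x in [lo, hi] to an unsigned integer of `bits` bits."""
    levels = (1 << bits) - 1
    return round((x - lo) / (hi - lo) * levels)


def split_high_low(values, low_bits=8):
    """Split each integer sample into high-bit and low-bit planes."""
    mask = (1 << low_bits) - 1
    high = [v >> low_bits for v in values]  # samples for the first video frame
    low = [v & mask for v in values]        # samples for the second video frame
    return high, low


def merge_high_low(high, low, low_bits=8):
    """Decoder side: recombine the two planes into the original samples."""
    return [(h << low_bits) | l for h, l in zip(high, low)]


# Example: three quantized coordinates split into two 8-bit planes.
samples = [0x1234, 0xFFFF, 0x0000]
high_plane, low_plane = split_high_low(samples)
```

The recombination is exact only if both planes survive coding unchanged; under lossy video coding, errors in the high-bit plane are scaled by `2 ** low_bits`, so that plane would presumably need gentler quantization than the low-bit plane.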
In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, the first model parameter represents position information of the Gaussian splats.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, a pixel value in the video frame is obtained by performing a non-linear transform on the model parameter.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, the non-linear transform is performed based on piece-wise linear scaling or a multiple order non-linear function.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, information on the piece-wise linear scaling or the multiple order non-linear function is encoded as a metadata, and the information comprises at least one of a number of piece intervals or coefficients of the non-linear function.

In the method of encoding a Gaussian splat for 3D space representation according to the present disclosure, each model parameter is packed into a separate video frame, a first model parameter is categorized into