KR-102965352-B1 - METHOD FOR OPERATING AN APPARATUS FOR THREE-DIMENSIONAL (3D) DATA GENERATION AND DIGITAL COMPOSITING USING GAUSSIAN SPLATTING

Abstract

A method of operating an electronic device is disclosed. The method of operating an electronic device according to the present disclosure includes the steps of: acquiring a plurality of images of a target area captured from multiple angles for a set period of time from a capturing device; acquiring 3D data by inputting the plurality of images into a Gaussian splatting algorithm; acquiring tracking data including at least one of position information and rotation information of a camera that captured reference data from at least one device; acquiring target 3D data by matching the 3D data with the tracking data; and acquiring final data by synthesizing the target 3D data with the reference data.

Inventors

최종대

Dates

Publication Date: 20260513
Application Date: 20251031

Claims (7)

A method of operating an electronic device, the method comprising: acquiring, using a capturing device, a plurality of images of a target area captured from multiple angles over a certain period of time; acquiring 3D data by inputting the plurality of images into a Gaussian splatting algorithm; acquiring tracking data including at least one of position information and rotation information of a camera corresponding to reference data; acquiring target 3D data by matching the 3D data with the tracking data; and acquiring final data by synthesizing the target 3D data with the reference data, wherein acquiring the target 3D data comprises: identifying a transformation matrix by integrating a translation vector calculated based on the position information of the reference data and a rotation matrix calculated based on the rotation information; and acquiring the target 3D data by matching the transformation matrix with the 3D data, wherein the method further comprises generating visibility information based on at least one of the target 3D data and the reference data, and wherein acquiring the final data comprises synthesizing the target 3D data with the reference data based on the visibility information.
The method of claim 1, wherein acquiring the 3D data comprises: extracting feature points from the plurality of images; obtaining depth information from the feature points; generating a point cloud from the depth information; and acquiring the 3D data by modeling the point cloud as a Gaussian distribution.
(Deleted)
The method of claim 1, further comprising: detecting an artifact comprising at least one of jump spreading, boundary blurring, and pixel distortion in the target 3D data; and correcting the target 3D data by removing the artifact.
The method of claim 1, further comprising correcting at least one of the color, brightness, and resolution of the target 3D data based on the reference data.
The method of claim 1, further comprising correcting noise in the target 3D data based on the reference data.
The method of claim 1, further comprising: obtaining, from the target 3D data, a depth map containing distance information for each pixel relative to the origin of a virtual coordinate system; and correcting the target 3D data by applying visual effects to the target 3D data based on the depth map.
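
The transformation step recited in the first claim (integrating a translation vector and a rotation matrix into one transformation matrix, then applying it to the 3D data) can be illustrated with a minimal sketch. The claims do not specify the rotation format, the coordinate convention, or the direction of the transform, so everything below is an assumption made for illustration: the helper names `rotation_y`, `make_transform`, and `apply_transform` are hypothetical, the rotation is taken as a single angle about the y axis, and the transform is applied only to Gaussian centers. A full Gaussian representation would likely also need each covariance rotated (as R Σ Rᵀ), which this sketch omits.

```python
import math

def rotation_y(angle_rad):
    """3x3 rotation about the y axis (assumed, illustrative rotation format)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def make_transform(rotation, translation):
    """Integrate a 3x3 rotation and a 3-vector translation into a 4x4
    homogeneous matrix of the form [R | t; 0 0 0 1]."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows

def apply_transform(transform, points):
    """Apply the 4x4 transform to (x, y, z) points, e.g. Gaussian centers."""
    out = []
    for x, y, z in points:
        p = (x, y, z, 1.0)  # homogeneous coordinates
        out.append(tuple(sum(transform[r][k] * p[k] for k in range(4))
                         for r in range(3)))
    return out

# Hypothetical tracking sample: 90 degree rotation about y, 2 units along x.
T = make_transform(rotation_y(math.pi / 2), [2.0, 0.0, 0.0])
centers = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
moved = apply_transform(T, centers)
```

Packing rotation and translation into a single 4x4 matrix means one matrix product per point aligns the reconstructed data with the tracked camera, and successive per-frame transforms can be chained by matrix multiplication.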

Description

Method for operating an apparatus for three-dimensional data generation and digital compositing using Gaussian splatting
The present disclosure relates to image processing technology for three-dimensional (3D) data, including 3D data with a time axis. More specifically, it relates to a method of operating an electronic device that generates static 3D data or dynamic 3D data including a time axis (hereinafter collectively referred to as '3D data') from a plurality of images using a Gaussian splatting technique, and then digitally composites that data with reference data to implement a parallax effect according to camera movement and obtain high-quality final data.
With recent advances in AI-based computer vision and reality-based 3D reconstruction, production pipelines in visual effects and digital compositing are evolving in new ways. Digital compositing refers to the process of producing final data in the form of a single integrated image or video by aligning and correcting visual data against reference data. In this specification, reference data refers to the base data onto which visual data is composited. Examples include images, live-action footage, computer graphics video, Gaussian splatting 3D data, video generated using artificial intelligence (AI), and data generated by a combination thereof. Accordingly, each embodiment illustrated in the drawings also encompasses all such variations.
Digital compositing is also a critical step in the visual effects production pipeline for generating final results, and it can be used to correct errors that occur during filming. When combined with various compositing software and rendering equipment, it improves work efficiency and helps reduce production time and cost.
As such, digital compositing is used as a key technology across the entire field of video production, including film, advertising, and metaverse content.
One of the critical technical challenges in digital compositing is naturally expressing the parallax effect caused by camera movement. Parallax is a visual phenomenon in which the relative positions of objects in a background, such as buildings and trees, change as the observer's viewpoint shifts; it is a key element in giving an image depth and realism. Basic parallax effects can be implemented in a 2D environment by adjusting the depth differences between objects, but this approach is limited in expressing complex 3D structures or changes in an object's appearance caused by camera movement. Overcoming these limitations requires reality-based 3D compositing rather than 2D compositing, which in turn requires reconstruction technology that converts the actual environment into a 3D model.
Early research on 3D reconstruction focused on light fields and basic 3D reconstruction techniques, which were limited in handling complex scenes and lighting conditions. The subsequent development of Structure-from-Motion (SfM) and Multi-View Stereo, both used in photogrammetry, enabled more sophisticated 3D reconstruction but remained limited in generating complex scenes. Neural Radiance Fields (NeRF), proposed to address this, can generate realistic 3D scenes by using deep learning to map spatial coordinates to color and density; however, it requires long training times and is difficult to modify because of its implicit representation. Gaussian Splatting, proposed to overcome these limitations, is based on an explicit representation that uses millions of Gaussians and enables efficient computation and real-time rendering through parallel processing.
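The depth dependence of parallax described above can be made concrete with a small worked example. This is only a sketch under simplifying assumptions that do not come from the disclosure: an ideal pinhole camera, a purely sideways camera translation with no rotation, and a hypothetical focal length of 1000 pixels. The function names `project_x` and `parallax_shift` are likewise illustrative.

```python
def project_x(world_x, depth, camera_x, focal_px):
    """Horizontal image coordinate (pixels) of a point seen by a pinhole
    camera at horizontal position camera_x, looking along the depth axis."""
    return focal_px * (world_x - camera_x) / depth

def parallax_shift(world_x, depth, camera_dx, focal_px=1000.0):
    """Image-space shift of a point when the camera slides sideways by
    camera_dx; algebraically this equals -focal_px * camera_dx / depth."""
    before = project_x(world_x, depth, 0.0, focal_px)
    after = project_x(world_x, depth, camera_dx, focal_px)
    return after - before

# Hypothetical scene: a tree 2 m away and a building 50 m away,
# with the camera sliding 0.1 m to the right.
near_shift = parallax_shift(world_x=0.0, depth=2.0, camera_dx=0.1)
far_shift = parallax_shift(world_x=0.0, depth=50.0, camera_dx=0.1)
```

Because the shift is inversely proportional to depth, the nearby tree moves across the image far more than the distant building. A flat 2D layer receives a single shift for all of its pixels, so depth variation inside one layer cannot be reproduced this way, which is the limitation that motivates 3D reconstruction.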
Gaussian Splatting generates high-quality 3D models by taking images captured from various angles as input and approximating the surfaces of objects or environments with Gaussian distributions. Unlike conventional NeRF, which relies on complex volumetric ray marching, it reduces unnecessary computation by projecting Gaussians directly onto the image plane; 'Gaussian-SLAM,' a Gaussian Splatting-based model, has achieved rendering speeds approximately 578 times faster than 'Point-SLAM,' a NeRF-based model. Thanks to these characteristics, Gaussian Splatting allows flexible processing even in scenes containing complex lighting or various objects.
Research is also actively underway to extend Gaussian Splatting, which has been limited to static scenes, to dynamic scenes that include a time axis. So-called 4D Gaussian Splatting maintains a single canonical set of 3D Gaussians while modeling the movement and shape changes of objects along the time axis using a Gaussian deformation field. This allows spatiotemporal changes in characters, objects, and environments to be reflected efficiently while maintaining training and rendering efficiency. In particular, since 4D Gaussian Splatt