CN-121982273-A - Three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method and system based on plane prior guidance

Abstract

The invention relates to a plane-prior-guided three-dimensional Gaussian splatting method and system for three-dimensional reconstruction of indoor scenes, and belongs to the technical field of three-dimensional reconstruction and computer graphics. It addresses the problem that, in indoor scene reconstruction and particularly in weakly textured regions, existing methods reconstruct geometry inaccurately and tend to produce artifacts. First, multi-view images and their camera parameters are acquired, depth and normal priors are predicted, and planar regions are extracted through multi-granularity segmentation and a geometric consistency check. Next, a dense point cloud is generated by back-projecting the depth prior, and a three-dimensional Gaussian splatting model is initialized from the downsampled point cloud. Finally, within a differentiable rendering framework, the model is optimized under a combination of planar-region geometric constraints, non-planar-region geometric constraints and global geometric constraints, which completes the reconstruction. The invention improves reconstruction accuracy in weakly textured planar regions, enhances detail recovery in structurally complex regions, and yields reconstructions with high visual fidelity and geometric accuracy.
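
The initialization step described above (back-projecting the depth prior and downsampling the resulting point cloud) can be sketched with NumPy as follows. This is a minimal illustration, not the patented implementation: the function names `backproject_depth` and `voxel_downsample`, the pinhole intrinsics matrix `K`, the 4x4 camera-to-world pose, the interpretation of depth as z-depth, and the choice of one centroid per voxel are all assumptions made for the example.

```python
import numpy as np


def backproject_depth(depth, K, cam_to_world):
    """Lift every pixel of an (H, W) z-depth map to a 3D world-space point.

    K is an assumed 3x3 pinhole intrinsics matrix and cam_to_world an
    assumed 4x4 rigid pose; neither convention is fixed by the source text.
    """
    h, w = depth.shape
    # Pixel grid: u runs along columns (x), v along rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Camera-space rays with z = 1, then scaled by the per-pixel depth.
    rays = pix.astype(np.float64) @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Rigid transform from camera to world coordinates.
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t


def voxel_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    # `inverse` maps each point to the index of its (unique) voxel.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    n_voxels = inverse.max() + 1
    sums = np.zeros((n_voxels, 3))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=n_voxels).reshape(-1, 1)
    return sums / counts
```

In such a sketch, the centroids returned by `voxel_downsample` would serve as the initial Gaussian centers; how the remaining Gaussian attributes (scale, rotation, opacity, color) are initialized is not specified in the text above.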

Inventors

QIN HONGXING
WANG LIANGTING

Assignees

Chongqing University

Dates

Publication Date: 2026-05-05
Application Date: 2026-04-07

Claims (10)

1. A three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance, characterized by comprising the following steps: acquiring a multi-view image sequence of an indoor scene to be reconstructed and the corresponding camera parameters; predicting a depth prior and a normal prior from the input images of the multi-view image sequence using a monocular depth estimation model and a normal estimation model, respectively; performing multi-granularity segmentation on the input image and the normal prior map using an image segmentation model to generate candidate region masks, and performing a geometric consistency check on the candidate regions in combination with the normal prior to extract planar prior regions that satisfy planar structural characteristics; back-projecting image pixels into three-dimensional space according to the depth prior and the camera parameters to generate a dense point cloud, performing voxel downsampling on the dense point cloud, and constructing a three-dimensional Gaussian splatting model from the downsampled point cloud; and, within the differentiable rendering framework of three-dimensional Gaussian splatting, optimizing the three-dimensional Gaussian splatting model under planar-region geometric constraints, non-planar-region geometric constraints and global geometric constraints to complete the three-dimensional reconstruction of the scene.
2. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 1, wherein the step of performing multi-granularity segmentation on the input image and the normal prior map using an image segmentation model to generate candidate region masks comprises: performing semantic-level segmentation on the input image using the image segmentation model to generate semantic-level candidate region masks, performing the geometric consistency check on the semantic-level candidate region masks, confirming the regions that pass the check as semantic-level planar regions, and removing them from the remaining valid region; performing instance-level segmentation, using the image segmentation model, on the valid region remaining after the semantic-level planar regions are removed to generate instance-level candidate region masks, performing the geometric consistency check on the instance-level candidate region masks, confirming the regions that pass the check as instance-level planar regions, and removing them from the remaining valid region; performing component-level segmentation, using the image segmentation model, on the valid region remaining after the semantic-level and instance-level planar regions are removed to generate component-level candidate region masks, performing the geometric consistency check on the component-level candidate region masks, and confirming the regions that pass the check as component-level planar regions; and merging the semantic-level, instance-level and component-level planar regions to form a global planar prior mask.
3. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 1, wherein the geometric consistency check comprises: for each candidate region mask, determining whether the candidate region mask spans a geometric edge; if the candidate region mask spans a geometric edge, cutting the candidate region mask along the geometric edge to obtain independent sub-regions; and filtering out mask regions whose pixel size is smaller than a preset threshold to obtain the final planar regions.
4. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 1, wherein the planar-region geometric constraints comprise a planar depth coplanarity constraint and a planar normal consistency constraint; wherein the planar depth coplanarity constraint is realized by applying a plane-parameter model fit with an L1 loss to the planar prior regions, and the planar normal consistency constraint is realized by applying normal-map gradient smoothing inside the planar prior regions.
5. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 4, wherein the loss function of the planar depth coplanarity constraint is expressed as: [formula missing in source]; wherein the loss is taken over the set of pixels of all planar prior regions, u denotes pixel coordinates, D(u) denotes the depth value rendered by the three-dimensional Gaussian splatting model, and the reference term denotes the ideal coplanar depth computed from the plane parameters of the planar prior region.
6. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 4, wherein the loss function of the planar normal consistency constraint is expressed as: [formula missing in source]; wherein N denotes the normal map obtained by Gaussian rendering, and the two mask terms are the joint mask in the horizontal direction and the joint mask in the vertical direction, respectively.
7. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 1, wherein the non-planar-region geometric constraint achieves local geometric smoothing by constructing, for each pixel in a non-planar region, a normal-weighted fused depth from the pixels of its local neighborhood, and its loss function is expressed as: [formula missing in source]; wherein the loss is taken over the set of non-planar pixels and is defined in terms of the depth value rendered by the three-dimensional Gaussian splatting model and the ideal depth of the target pixel obtained by weighted fusion over its neighborhood pixels, the ideal depth being calculated as: [formula missing in source]; wherein each pixel in the set of neighborhood pixels of the target pixel contributes, with its own weight, a predicted depth for the target pixel computed from that neighborhood pixel's normal and three-dimensional coordinates.
8. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 1, wherein the global geometric constraints comprise a depth prior constraint, a normal prior constraint and a depth-normal consistency constraint; wherein the loss function of the depth prior constraint, the loss function of the normal prior constraint and the loss function of the depth-normal consistency constraint are each given by a formula [formulas missing in source]; wherein the losses are taken over the set of all pixels of the image, D denotes the rendered depth map and is compared against the depth prior map, N denotes the rendered normal map and is compared against the normal prior map, and the depth-normal consistency constraint uses the surface normal map derived from the rendered depth map.
9. The three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance of claim 1, wherein the three-dimensional Gaussian splatting model is jointly optimized with a total loss function expressed as: [formula missing in source]; wherein the terms are the photometric consistency loss under the differentiable rendering framework of three-dimensional Gaussian splatting, the planar depth coplanarity constraint loss, the planar normal consistency constraint loss, the non-planar-region geometric constraint loss, the depth prior constraint loss, the normal prior constraint loss and the depth-normal consistency constraint loss, and the preset weight coefficients correspond to the respective loss terms.
10. A three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction system based on plane prior guidance, characterized by comprising: an image and parameter acquisition module for acquiring a multi-view image sequence of an indoor scene to be reconstructed and the corresponding camera parameters; a prior information extraction module for predicting a depth prior and a normal prior from the input images of the multi-view image sequence using a monocular depth estimation model and a normal estimation model, respectively; a planar prior extraction module for performing multi-granularity segmentation on the input image using an image segmentation model to generate candidate region masks, and performing a geometric consistency check on the candidate regions in combination with the normal prior to extract planar prior regions that satisfy planar structural characteristics; a model initialization module for back-projecting image pixels into three-dimensional space according to the depth prior and the camera parameters to generate a dense point cloud, performing voxel downsampling on the dense point cloud, and constructing a three-dimensional Gaussian splatting model from the downsampled point cloud; and a model optimization and reconstruction module for optimizing, within the differentiable rendering framework of three-dimensional Gaussian splatting, the three-dimensional Gaussian splatting model under planar-region geometric constraints, non-planar-region geometric constraints and global geometric constraints to complete the three-dimensional reconstruction of the scene.
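
As an illustration of the planar depth coplanarity constraint of claims 4 and 5, the following NumPy sketch fits a plane to the back-projected rendered depth inside one planar prior region and measures the L1 gap between the rendered depth and the depth implied by that plane. It is only a sketch under stated assumptions: the claims do not say how the plane parameters are obtained, so the orthogonal least-squares fit via SVD, the mean normalization over the region, the pinhole intrinsics `K`, and the names `fit_plane` and `coplanar_depth_loss` are all hypothetical. The code is also plain NumPy and therefore not differentiable; an actual training loop would express the same computation in an automatic differentiation framework.

```python
import numpy as np


def fit_plane(points):
    """Orthogonal least-squares plane n.x + d = 0 through an (N, 3) point set."""
    centroid = points.mean(axis=0)
    # The right singular vector of the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, -normal @ centroid


def coplanar_depth_loss(depth, mask, K):
    """Mean L1 gap between rendered z-depth and the depth of a fitted plane.

    depth: (H, W) rendered depth map; mask: (H, W) boolean mask of ONE
    planar prior region; K: assumed 3x3 pinhole intrinsics matrix.
    """
    v, u = np.nonzero(mask)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T          # camera rays with z = 1
    rendered = depth[v, u]
    points = rays * rendered[:, None]        # back-projected region points
    normal, d = fit_plane(points)
    # A point z * ray lies on the plane when n.(z * ray) + d = 0,
    # so the ideal coplanar depth along each ray is z = -d / (n.ray).
    # (Assumes the plane is not viewed edge-on, i.e. n.ray is nonzero.)
    ideal = -d / (rays @ normal)
    return np.abs(rendered - ideal).mean()
```

For a depth map that is already planar inside the mask the loss is numerically zero, and it grows as pixels in the region deviate from the fitted plane, which is the behavior a coplanarity penalty of this kind is meant to have.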

Description

Three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method and system based on plane prior guidance

Technical Field

The invention belongs to the technical field of three-dimensional reconstruction and computer graphics, and relates to a three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method and system based on plane prior guidance.

Background

With the development of virtual reality (VR), augmented reality (AR) and automated robotics, high-precision three-dimensional digital modeling of real-world scenes has become an important research direction. Conventional three-dimensional reconstruction methods are mainly based on multi-view stereo (MVS), which recovers three-dimensional structure through image feature matching. However, such methods struggle to work reliably in weakly textured regions or in environments with large illumination changes. In recent years, the Neural Radiance Field (NeRF) has achieved high-quality novel view synthesis through volume rendering, but because it represents the scene implicitly with a multi-layer perceptron (MLP), training and inference are computationally expensive and real-time rendering is difficult to achieve. 3D Gaussian Splatting (3DGS) addresses the high computational cost of NeRF by using anisotropic three-dimensional Gaussian ellipsoids as scene representation units and achieving efficient real-time rendering through differentiable rasterization. However, existing 3DGS methods rely mainly on photometric consistency for optimization and lack explicit constraints on geometric structure, so indoor scene reconstruction is prone to problems such as geometric collapse, floater artifacts and surface distortion.
Indoor environments often have pronounced structural features: walls, floors and desktops typically appear as large planar surfaces. To introduce a planar geometric prior into the optimization of three-dimensional Gaussian splatting and thereby improve the geometric accuracy of scene reconstruction, the invention provides a three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method and system based on plane prior guidance.

Disclosure of Invention

In view of the above, the present invention aims to provide a three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method and system based on plane prior guidance. To achieve the above purpose, the present invention provides the following technical solutions:

A three-dimensional Gaussian splatting indoor scene three-dimensional reconstruction method based on plane prior guidance comprises the following steps: acquiring a multi-view image sequence of an indoor scene to be reconstructed and the corresponding camera parameters; predicting a depth prior and a normal prior from the input images of the multi-view image sequence using a monocular depth estimation model and a normal estimation model, respectively; performing multi-granularity segmentation on the input image and the normal prior map using an image segmentation model to generate candidate region masks, and performing a geometric consistency check on the candidate regions in combination with the normal prior to extract planar prior regions that satisfy planar structural characteristics; back-projecting image pixels into three-dimensional space according to the depth prior and the camera parameters to generate a dense point cloud, performing voxel downsampling on the dense point cloud, and constructing a three-dimensional Gaussian splatting model from the downsampled point cloud; and, within the differentiable rendering framework of three-dimensional Gaussian splatting, optimizing the three-dimensional Gaussian splatting model under planar-region geometric constraints, non-planar-region geometric constraints and global geometric constraints to complete the three-dimensional reconstruction of the scene.

Further, the step of performing multi-granularity segmentation on the input image and the normal prior map using an image segmentation model to generate candidate region masks comprises: performing semantic-level segmentation on the input image using the image segmentation model to generate semantic-level candidate region masks, performing the geometric consistency check on the semantic-level candidate region masks, confirming the regions that pass the check as semantic-level planar regions, and removing them from the remaining valid region; performing instance-level segmentation, using the image segmentation model, on the valid region remaining after the semantic-level planar regions are removed, generating instance-level candidate region ma