CN-121999111-A - Sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation
Abstract
The invention relates to a sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation, and belongs to the technical field of three-dimensional reconstruction and computer vision. The method comprises the following steps: S1, input preprocessing; S2, dual-model initialization; S3, asymmetric dual-model regularization training, comprising asymmetric densification and pruning, collaborative geometric filtering, and a depth consistency constraint; S4, opacity perturbation regularization, comprising random perturbation, a global opacity constraint, and perturbed photometric reconstruction loss calculation; and S5, output and application. The core innovation of the invention is an asymmetric dual-model regularization framework and an opacity perturbation regularization mechanism: the former achieves cross-model artifact suppression and structure recovery through geometric consistency constraints, while the latter breaks the opacity coupling between artifacts and structural Gaussians through perturbation and a global constraint; together they form a complete regularization system for sparse view three-dimensional reconstruction.
Inventors
- CAI SHUTING
- CHEN WENJUN
- LI HUIHUI
- XIE XIAOEN
- HUANG ZHEN
Assignees
- Guangdong University of Technology (广东工业大学)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2026-01-28
Claims (8)
- 1. A sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation, characterized by comprising the following steps (illustrative code sketches of steps S2-S4 follow the claims):
  S1, input preprocessing: acquire a sparse-view input image set I = {I_v}, and estimate the camera extrinsics (R_v, t_v) and a sparse point cloud P = {p_i} by structured light or a structure-from-motion method;
  S2, dual-model initialization: parameterize each Gaussian by a position μ, a covariance Σ, a color f and an opacity α, and initialize two models G_0 and G_1 independently from the sparse point cloud P;
  S3, asymmetric dual-model regularization training:
  asymmetric densification and pruning: for G_0, the densification threshold is θ_split^0 = 0.0008 and the pruning threshold is p_split^0 = 0.005; for G_1, the densification threshold is θ_split^1 = 0.0025 and the pruning threshold is p_split^1 = 0.02; when the reconstruction error e_i = ||I_v − R(G; v)||² / N_pixel attributed to a Gaussian exceeds θ_split, a splitting operation decomposes it into two child Gaussians with μ′ = μ ± Δμ, Σ′ = 0.5Σ and α′ = α/2, where Δμ = Σ^{1/2}·ζ and ζ is standard normal noise; a Gaussian is deleted when α_i < p_split;
  collaborative geometric filtering: in each iteration, for each Gaussian center c_i^0 in G_0, find the nearest neighboring center c_j^1 in G_1 and compute δ_i^{0→1} = ||c_i^0 − c_j^1||_2; if δ_i^{0→1} > η_geo(t), the Gaussian is regarded as spatially inconsistent;
  depth consistency constraint: for each input view v, render the depth maps D_0(v) and D_1(v) of the two models and compute the depth-difference loss L_depth^(k) = (1/|Ω|) Σ_p |D_0(p) − D_1(p)|, k ∈ {0, 1};
  S4, opacity perturbation regularization training:
  random perturbation (Opacity Randomization): in each iteration, draw a random number r_i ~ U(0, 1) for each Gaussian i; if r_i < p_reset, the opacity is temporarily zeroed, i.e. α̃_i = 0, otherwise α̃_i = α_i;
  global opacity constraint (Global Opacity Clamping): to prevent overexposure caused by some Gaussians compensating for the perturbation, all α_i undergo a truncation operation α_i ← min(σ(α_i), α_max), where α_max = 0.8 and σ(·) is the sigmoid function;
  perturbed photometric reconstruction loss calculation (Perturbed Photometric Loss): render with the perturbed opacity set {α̃_i}; the photometric loss is defined as L_photo^OPR = Σ_{v∈V} Σ_p ||I_v(p) − R(G, α̃; v)(p)||²;
  S5, output and application: the final output model is the main model G_0, comprising the Gaussian parameter set {μ_i, Σ_i, f_i, α_i}, and can be used directly for real-time rendering and novel view synthesis.
- 2. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S2 the initialization parameters satisfy position μ_i = p_i, covariance Σ_i = σ_0 I_3 with σ_0 ∈ [0.01, 0.05], color f_i = I_v(p_i), and opacity α_i = 0.1.
- 3. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S3, during asymmetric densification and pruning, G_0 is checked every 100 steps and G_1 every 200 steps.
- 4. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S3, in collaborative geometric filtering, the dynamic threshold is defined as η_geo(t) = η_0 (1 − t/T_total), where η_0 = 1.0×10⁻³ and T_total = 10⁴.
- 5. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S3, to prevent invalid regions from misleading the optimization in the depth consistency constraint, a validity mask is constructed with M_v(p) = 1 when τ_low ≤ D_1(p) ≤ τ_high and M_v(p) = 0 otherwise, and the loss is finally computed only on the valid region Ω_v = {p | M_v(p) = 1}: L_depth^(0) = (1/|Ω_v|) Σ_{p∈Ω_v} |D_0(p) − D_1(p)|; L_depth^(1) = (1/|Ω|) Σ_p |D_0(p) − D_1(p)|; λ_0(t) = 0.1·exp(−t/3000), λ_1(t) = 0.5·λ_0(t).
- 6. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S4, in the random perturbation, the perturbation probability p_reset is increased gradually from 0 to 0.3 according to the schedule p_reset(t) = 0.3·(t/T_total).
- 7. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S4, the global opacity constraint is applied every 500 steps to limit the dominance of any single Gaussian in the rendering process.
- 8. The sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation according to claim 1, wherein in step S4, after all losses are integrated in the perturbed photometric reconstruction loss calculation, the system optimization objective is L_total = L_photo + λ_0 L_depth + λ_1 L_OPR, where λ_0 = 0.05 and λ_1 = 0.01 are constants.
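The dual-model initialization of step S2 and claim 2 can be written down directly. Below is a minimal NumPy sketch, assuming the SfM point cloud and per-point colors are already available as arrays; the function and variable names (`init_from_point_cloud`, `G0`, `G1`) are illustrative and not from the patent.

```python
# Minimal sketch of the dual-model initialization (S2 / claim 2).
import numpy as np

def init_from_point_cloud(points, colors, sigma0=0.03, alpha0=0.1):
    """Initialize one Gaussian set from the sparse SfM point cloud P.

    points : (N, 3) point positions p_i
    colors : (N, 3) colors sampled from the input images, f_i = I_v(p_i)
    sigma0 : isotropic scale, assumed drawn from [0.01, 0.05] per claim 2
    """
    n = points.shape[0]
    return {
        "mu": points.copy(),                            # position mu_i = p_i
        "cov": np.tile(sigma0 * np.eye(3), (n, 1, 1)),  # covariance Sigma_i = sigma0 * I_3
        "color": colors.copy(),                         # color f_i
        "alpha": np.full(n, alpha0),                    # opacity alpha_i = 0.1
    }

# Two independent sets G_0 (main) and G_1 (auxiliary) share the same point cloud.
P = np.random.rand(100, 3)   # stand-in for the SfM point cloud
C = np.random.rand(100, 3)   # stand-in for the sampled colors
G0 = init_from_point_cloud(P, C)
G1 = init_from_point_cloud(P, C)
```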
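The asymmetric densification and pruning of S3 (claims 1 and 3) amounts to the same split/prune rule run with different thresholds and check intervals for the two models. The sketch below follows that reading; how the image-level error e_i is attributed to an individual Gaussian is an assumption (`errors[i]`), all names are illustrative, and the Cholesky factor is used as one choice of the matrix square root Σ^{1/2}.

```python
# Sketch of the asymmetric split/prune rule (S3, claims 1 and 3).
import numpy as np

THRESH = {0: {"split": 0.0008, "prune": 0.005, "interval": 100},   # G_0, checked every 100 steps
          1: {"split": 0.0025, "prune": 0.02,  "interval": 200}}   # G_1, checked every 200 steps

def split_gaussian(mu, cov, alpha):
    """Split one Gaussian into two children: mu' = mu +/- dmu, Sigma' = 0.5*Sigma,
    alpha' = alpha / 2, with dmu = Sigma^{1/2} . zeta, zeta ~ N(0, I)."""
    zeta = np.random.randn(3)
    dmu = np.linalg.cholesky(cov) @ zeta     # Cholesky as the matrix square root
    child_cov, child_alpha = 0.5 * cov, 0.5 * alpha
    return [(mu + dmu, child_cov, child_alpha),
            (mu - dmu, child_cov, child_alpha)]

def densify_and_prune(gaussians, errors, model_id):
    """gaussians: list of (mu, cov, alpha); errors[i]: per-Gaussian error e_i."""
    t = THRESH[model_id]
    kept = []
    for (mu, cov, alpha), e in zip(gaussians, errors):
        if alpha < t["prune"]:               # prune low-opacity Gaussians
            continue
        if e > t["split"]:                   # densify high-error Gaussians
            kept.extend(split_gaussian(mu, cov, alpha))
        else:
            kept.append((mu, cov, alpha))
    return kept
```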
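Collaborative geometric filtering (claims 1 and 4) reduces to a nearest-neighbour query between the two models' centers against a tolerance that decays linearly over training. A sketch using SciPy's `cKDTree` follows; the patent does not specify the nearest-neighbour implementation, so the KD-tree is an assumption.

```python
# Sketch of collaborative geometric filtering (S3, claims 1 and 4).
import numpy as np
from scipy.spatial import cKDTree

ETA0, T_TOTAL = 1.0e-3, 10_000

def eta_geo(t):
    """Dynamic threshold eta_geo(t) = eta_0 * (1 - t / T_total), claim 4."""
    return ETA0 * (1.0 - t / T_TOTAL)

def inconsistent_mask(centers0, centers1, t):
    """Flag Gaussians in G_0 whose nearest G_1 center is farther than eta_geo(t)."""
    tree = cKDTree(centers1)
    dist, _ = tree.query(centers0)   # delta_i^{0->1} = ||c_i^0 - c_j^1||_2
    return dist > eta_geo(t)         # True = spatially inconsistent

c0 = np.random.rand(500, 3)          # stand-in centers of G_0
c1 = np.random.rand(400, 3)          # stand-in centers of G_1
mask = inconsistent_mask(c0, c1, t=2_000)   # candidates for suppression
```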
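The masked depth-consistency loss of claim 5 and its weight schedules are straightforward to express over rendered depth maps. In the sketch below, `D0` and `D1` stand for the depth maps rendered from G_0 and G_1 for one view; taking L_depth^(1) over the full domain Ω follows the literal wording of claim 5, and combining the two terms with λ_0(t) and λ_1(t) is my reading of that claim.

```python
# Sketch of the masked depth-consistency loss and schedules (claim 5).
import numpy as np

def depth_losses(D0, D1, tau_low, tau_high):
    valid = (D1 >= tau_low) & (D1 <= tau_high)   # validity mask M_v(p)
    diff = np.abs(D0 - D1)
    # L_depth^(0): averaged only over the valid region Omega_v
    l0 = diff[valid].mean() if valid.any() else 0.0
    # L_depth^(1): averaged over the full domain Omega, per the claim's wording
    l1 = diff.mean()
    return l0, l1

def lambda0(t):
    return 0.1 * np.exp(-t / 3000.0)   # lambda_0(t) = 0.1 * exp(-t / 3000)

def lambda1(t):
    return 0.5 * lambda0(t)            # lambda_1(t) = 0.5 * lambda_0(t)

D0 = np.random.rand(64, 64) * 5.0      # stand-in rendered depth maps
D1 = np.random.rand(64, 64) * 5.0
l0, l1 = depth_losses(D0, D1, tau_low=0.5, tau_high=4.5)
loss = lambda0(1000) * l0 + lambda1(1000) * l1   # assumed combination
```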
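Finally, the opacity-perturbation regularization of S4 (claims 6 through 8) combines a linearly growing reset probability, a periodic global clamp, and the weighted total objective. The sketch below wires these together; the differentiable rasterizer itself is out of scope here, so the individual losses are taken as given scalars, and storing opacities as pre-sigmoid logits is an assumption implied by the σ(α_i) in the claim.

```python
# Sketch of opacity perturbation, global clamping, and the total loss
# (S4, claims 6, 7, 8).
import numpy as np

T_TOTAL, ALPHA_MAX = 10_000, 0.8

def p_reset(t):
    """Perturbation probability grows linearly from 0 to 0.3 (claim 6)."""
    return 0.3 * (t / T_TOTAL)

def perturb_opacity(alpha, t, rng):
    """Randomly zero each alpha_i with probability p_reset(t) (claim 1, S4)."""
    r = rng.random(alpha.shape)                 # r_i ~ U(0, 1)
    return np.where(r < p_reset(t), 0.0, alpha)

def clamp_opacity(alpha_logit):
    """Global constraint alpha_i <- min(sigmoid(alpha_i), alpha_max),
    applied every 500 steps per claim 7."""
    return np.minimum(1.0 / (1.0 + np.exp(-alpha_logit)), ALPHA_MAX)

def total_loss(l_photo, l_depth, l_opr, lam0=0.05, lam1=0.01):
    """Objective of claim 8: L_total = L_photo + lambda_0*L_depth + lambda_1*L_OPR."""
    return l_photo + lam0 * l_depth + lam1 * l_opr

rng = np.random.default_rng(0)
alpha = np.full(1000, 0.1)
alpha_tilde = perturb_opacity(alpha, t=4_000, rng=rng)   # used to render L_photo^OPR
```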
Description
Sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation
Technical Field
The invention relates to a sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation, and belongs to the technical field of three-dimensional reconstruction and computer vision.
Background
In recent years, 3D Gaussian Splatting (3DGS) has become a new paradigm for high-quality novel view synthesis. Unlike NeRF models based on volume rendering, 3DGS achieves a balance between real-time rendering and high-fidelity reconstruction by representing the scene as a set of differentiable Gaussian distributions. This advantage makes it highly promising for applications in virtual reality (VR/AR), robot vision, digital content generation, and related fields. However, the excellent performance of 3DGS usually depends on dense, multi-view input, and this dependence on observation density limits its usability in real scenes. Observations in real environments tend to be sparse; for example, data acquired by handheld mobile-phone photography or fast aerial photography by unmanned aerial vehicles typically covers only a limited range of viewing angles. Under such conditions, 3DGS easily overfits to the training views: although the model can produce sharp renderings at the training views, it generates a large number of floating artifacts at unseen views, so the rendering results become blurry, ghosting is severe, the geometric structure collapses, and practical application requirements are difficult to meet.
To alleviate this problem, existing work mostly improves along the geometric and appearance directions. Geometry-oriented methods (such as CoR-GS and DNGaussian) stabilize geometry by introducing dual-model consistency and depth regularization, while LoopSparseGS and SparseGS reduce Gaussian floating by depth alignment and unobserved-view regularization, improving geometric consistency. Appearance-oriented approaches focus on rendering consistency and color stabilization: for example, DropGaussian randomly discards part of the Gaussians during training to break overfitting, DropoutGS promotes color consistency through gradient-distribution-based probabilistic dropout, AugGS constrains the color distribution to enhance cross-view stability, and VGNC dynamically adjusts the Gaussian density to maintain training robustness. In addition, some work (such as SparseGS, LoopSparseGS, FSGS, DNGaussian, MVPGS, and others) introduces external geometric priors to mitigate the structural degradation caused by input sparsity. Despite the progress made by these methods, 3DGS at sparse viewing angles still generally suffers from residual artifacts that are inconsistent with the cross-view geometry, and it remains difficult to combine geometric stability with high-fidelity rendering.
We have found through extensive analysis that the ill-posedness of the photometric error at sparse viewing angles is the root cause of the artifacts. 3DGS learns the scene representation by minimizing a photometric consistency loss; when the input views are sufficiently dense, the overlap regions between multiple views can indirectly constrain the geometry.
However, under sparse viewing conditions, photometric supervision is effective only over a limited range of directions, and any Gaussian that reduces the error at the training views is preserved and optimized even if it does not correspond to a real surface. Lacking geometric consistency constraints, these floating artifacts continue to accumulate, causing depth discontinuities and structural shifts. Experiments verify that under dense inputs 3DGS generates coherent geometry, while under sparse inputs artifacts accumulate and gradually deviate from the true surface as training iterations progress. Furthermore, we found that the opacity distribution of the floating artifacts is highly coupled with that of the effective structural Gaussians: the two evolve synchronously during training, so opacity-based pruning cannot effectively separate the artifacts from the real structure. This problem arises from the lack of geometric priors in photometric supervision, which causes the artifacts to become entangled with the real Gaussians during optimization. On this basis, the invention provides a sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation.
Disclosure of Invention
In view of the above, the invention provides a sparse view three-dimensional Gaussian reconstruction method based on asymmetric dual-model regularization and opacity perturbation, which combines an asymmetric dual-model regularization framework with an opacity perturbation regularization mechanism: the former achieves cross-model artifact suppression and structure recovery through geometric consistency constraints, while the latter breaks the opacity coupling between artifacts and structural Gaussians through perturbation and a global constraint.