CN-122023742-A - Gaussian acceleration method, device, and storage medium based on structure priors and directed growth
Abstract
The application discloses a Gaussian acceleration method, device, and storage medium based on structure priors and directed growth. The method comprises: obtaining a first point cloud and camera poses reconstructed by structure from motion (SfM), and a second point cloud and a depth prior produced by a feedforward structure estimation model; aligning the second point cloud to the first point cloud with a three-dimensional similarity transformation; generating an initial Gaussian set by applying denoising and redundancy suppression to the alignment result; calculating a composite score for each Gaussian in the initial Gaussian set based on the difference between its prior depth and its rendered depth, its image projection features, and a preset confidence; sorting the Gaussians by composite score to select target Gaussians; determining a rendering resolution based on the entropy of a training image set; and performing a point-adding (densification) operation on the target Gaussians to optimize the three-dimensional Gaussian model. By fusing structure priors, composite scores, and adaptive scheduling to jointly control initialization, growth direction, and complexity, the application accelerates training and improves model quality.
Inventors
- WANG RONGGANG
- YANG YUCHENG
- GAO WEN
- ZHANG JIAN
Assignees
- Peking University Shenzhen Graduate School (北京大学深圳研究生院)
Dates
- Publication Date: 20260512
- Application Date: 20251226
Claims (10)
- 1. A Gaussian acceleration method based on structure priors and directed growth, characterized in that the method comprises the steps of: acquiring a first point cloud and camera poses reconstructed by structure from motion (SfM), and a second point cloud and a depth prior produced by a feedforward structure estimation model; aligning the second point cloud to the first point cloud with a three-dimensional similarity transformation, and generating an initial Gaussian set by applying denoising and redundancy suppression to the alignment result, wherein a three-dimensional Gaussian model is constructed from the initial Gaussian set; calculating a composite score for each Gaussian in the initial Gaussian set based on the difference between the prior depth and the rendered depth of that Gaussian, its image projection features, and a preset confidence; sorting the Gaussians by composite score and selecting target Gaussians from the sorted result; acquiring a training image set and determining a rendering resolution based on entropy values of the training image set; and performing a point-adding (densification) operation on the target Gaussians, based on a point-adding budget corresponding to the rendering resolution, to optimize the three-dimensional Gaussian model.
- 2. The Gaussian acceleration method based on structure priors and directed growth according to claim 1, characterized in that the step of aligning the second point cloud to the first point cloud with a three-dimensional similarity transformation comprises: calculating, using the Umeyama algorithm, a three-dimensional similarity transformation matrix that transforms the second point cloud into the coordinate system of the first point cloud; and applying the transformation matrix to the coordinates of the second point cloud so that the second point cloud is spatially aligned with the first point cloud.
- 3. The Gaussian acceleration method based on structure priors and directed growth according to claim 1, characterized in that the step of generating an initial Gaussian set by applying denoising and redundancy suppression to the alignment result comprises: calculating, for each point in the second point cloud, the nearest-neighbor distance to the points of the first point cloud, determining target points whose nearest-neighbor distance is greater than a preset threshold, and removing those target points; and performing adaptive voxel-grid sampling on the points remaining after the target points are removed, and extracting a preset number of points from the sampling result to generate the initial Gaussian set.
- 4. The Gaussian acceleration method based on structure priors and directed growth according to claim 1, characterized in that the step of calculating a composite score for each Gaussian based on the difference between the prior depth and the rendered depth of each Gaussian in the initial Gaussian set, the image projection features, and the preset confidence comprises: acquiring a prior depth value of the Gaussian, and rendering the Gaussian to obtain a rendered depth value; calculating a normalized difference between the prior depth value and the rendered depth value, and taking the normalized difference as a depth difference metric; acquiring image projection features of the Gaussian in at least one training view, wherein the image projection features comprise at least one of projection area, shape parameters, transparency, and visibility weight; and fusing the depth difference metric, the quantized image projection features, and the preset confidence to obtain the composite score.
- 5. The Gaussian acceleration method based on structure priors and directed growth according to claim 4, characterized in that the step of fusing the depth difference metric, the quantized image projection features, and the preset confidence to obtain the composite score comprises: normalizing each of the plurality of features in the image projection features and computing their weighted sum to obtain a composite image contribution value; acquiring the preset confidence of the spatial region in which the Gaussian is located; and calculating the product of the depth difference metric, the composite image contribution value, and the confidence to obtain the composite score.
- 6. The Gaussian acceleration method based on structure priors and directed growth according to claim 1, characterized in that the step of acquiring a training image set and determining a rendering resolution based on entropy values of the training image set comprises: calculating the information entropy of each training image in the training image set and obtaining a composite entropy value from the individual information entropies, wherein the composite entropy value characterizes the complexity of the training content; mapping the composite entropy value to a corresponding rendering resolution level according to a preset correspondence between complexity and resolution level; and determining the rendering resolution according to the rendering resolution level.
- 7. The Gaussian acceleration method based on structure priors and directed growth according to claim 1, characterized in that determining the point-adding budget corresponding to the rendering resolution comprises: counting the number of Gaussians in the initial Gaussian set and determining the rendering resolution level corresponding to the rendering resolution; calculating, based on the number of Gaussians and the rendering resolution level and according to a preset functional relationship, the maximum number of Gaussians allowed to be added in the current training stage of the three-dimensional Gaussian model; and taking that maximum number as the point-adding budget.
- 8. The Gaussian acceleration method based on structure priors and directed growth according to claim 1, characterized in that the step of performing a point-adding operation on the target Gaussians based on the point-adding budget to optimize the three-dimensional Gaussian model comprises: determining an operation strategy according to the aggregate characteristics and the image projection state of the target Gaussian; processing the target Gaussian according to the operation strategy to obtain one or more new Gaussians, wherein the operation strategy comprises a cloning operation and a splitting operation; if the operation strategy is the cloning operation, generating, at a spatial position adjacent to the target Gaussian, a new Gaussian whose parameters are similar to those of the target Gaussian; if the operation strategy is the splitting operation, dividing the target Gaussian into a plurality of Gaussians of respective sizes and assigning the parameters of the target Gaussian to those Gaussians to obtain the new Gaussians; and adding the new Gaussians to the initial Gaussian set to optimize the three-dimensional Gaussian model.
- 9. A Gaussian acceleration device based on structure priors and directed growth, characterized in that the device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, the computer program being configured to implement the steps of the Gaussian acceleration method based on structure priors and directed growth according to any one of claims 1 to 8.
- 10. A storage medium, characterized in that the storage medium is a computer-readable storage medium on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the Gaussian acceleration method based on structure priors and directed growth according to any one of claims 1 to 8.
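As an illustration of the alignment step recited in claim 2, the following is a minimal NumPy sketch of Umeyama's closed-form similarity estimate. It assumes the two clouds are already available as matched point pairs (row i of `src` corresponds to row i of `dst`); the claims do not state how correspondences are obtained, and the function name `umeyama_similarity` is hypothetical.

```python
import numpy as np

def umeyama_similarity(src, dst):
    """Return (s, R, t) minimizing sum ||dst_i - (s * R @ src_i + t)||^2.

    src, dst: (N, 3) arrays of corresponding points (an assumption of this
    sketch; the patent text does not specify the matching procedure).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst
    # 3x3 cross-covariance between the centered clouds.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)
    # Guard against reflections so that det(R) = +1.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

def apply_similarity(points, s, R, t):
    """Map points into the target coordinate system."""
    return s * np.asarray(points, dtype=float) @ R.T + t
```

In the setting of the claims, `src` would hold points of the second (feedforward) cloud and `dst` the matching points of the first (SfM) cloud, and `apply_similarity` would then be applied to the whole second cloud.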
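The denoising and redundancy-suppression step of claim 3 could look roughly like the sketch below. The brute-force distance computation, the rule of growing the voxel size until the point count fits, and all names and default values are assumptions made for illustration; the claims only require a nearest-neighbor threshold, adaptive voxel-grid sampling, and a preset number of points.

```python
import numpy as np

def filter_by_nearest_neighbor(second, first, max_dist):
    """Drop points of `second` whose nearest point in `first` is farther than max_dist."""
    second = np.asarray(second, dtype=float)
    first = np.asarray(first, dtype=float)
    # Brute-force pairwise distances; a KD-tree would be used at real scale.
    dists = np.linalg.norm(second[:, None, :] - first[None, :, :], axis=2)
    return second[dists.min(axis=1) <= max_dist]

def voxel_downsample(points, voxel):
    """Keep the first point that falls into each occupied voxel."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    return points[np.sort(first_idx)]

def adaptive_voxel_sample(points, target_count, voxel=0.01, growth=1.5, max_steps=50):
    """Grow the voxel size until at most target_count points remain (assumed rule)."""
    points = np.asarray(points, dtype=float)
    sampled = points
    for _ in range(max_steps):
        sampled = voxel_downsample(points, voxel)
        if len(sampled) <= target_count:
            return sampled
        voxel *= growth
    # Fallback: truncate if the voxel growth did not reach the target.
    return sampled[:target_count]
```

A typical use would be `adaptive_voxel_sample(filter_by_nearest_neighbor(aligned_second, first, max_dist), target_count)`, whose output positions seed the initial Gaussian set.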
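Claims 4 and 5 describe the composite score as the product of a depth difference metric, a weighted image contribution value, and a confidence, and claim 1 selects target Gaussians from the sorted scores. The sketch below is one possible reading: the relative depth error, the min-max normalization across Gaussians, the feature names, and the weights are all assumptions, since the claims leave these choices open.

```python
def min_max(values):
    """Scale a list of numbers to [0, 1]; constant lists map to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def composite_scores(prior_depth, rendered_depth, features, weights, confidence, eps=1e-8):
    """Per-Gaussian score = depth difference * image contribution * confidence.

    features: dict mapping a feature name to a per-Gaussian list of values.
    weights:  dict mapping the same names to weights (hypothetical values).
    """
    n = len(prior_depth)
    # Normalized depth difference (relative error, an assumed normalization).
    depth_diff = [abs(p - r) / max(abs(p), eps)
                  for p, r in zip(prior_depth, rendered_depth)]
    normalized = {name: min_max(vals) for name, vals in features.items()}
    # Weighted sum of the normalized projection features.
    contribution = [sum(weights[name] * normalized[name][i] for name in normalized)
                    for i in range(n)]
    return [d * c * k for d, c, k in zip(depth_diff, contribution, confidence)]

def select_targets(scores, k):
    """Indices of the k highest-scoring Gaussians, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

Because the terms are multiplied, a Gaussian whose rendered depth already matches the prior receives a score of zero regardless of its projection features, which keeps new points away from regions that are already well reconstructed.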
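The entropy-driven scheduling of claims 6 and 7 can be sketched as follows. Images are assumed to be flat sequences of integer gray levels, the composite entropy is taken to be the mean over images, and the thresholds and per-level growth ratios are hypothetical placeholders for the "preset correspondence" and "preset functional relationship" that the claims leave unspecified.

```python
import math
from collections import Counter

def image_entropy(pixels):
    """Shannon entropy in bits of a flat sequence of integer gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def resolution_level(images, thresholds=(2.0, 5.0)):
    """Map the mean image entropy to a level: 0 (coarsest) up to len(thresholds)."""
    mean_entropy = sum(image_entropy(img) for img in images) / len(images)
    # Count how many thresholds the mean entropy reaches.
    return sum(mean_entropy >= t for t in thresholds)

def point_adding_budget(num_gaussians, level, growth_per_level=(0.05, 0.10, 0.20)):
    """Maximum number of Gaussians that may be added in the current stage."""
    return int(num_gaussians * growth_per_level[level])
```

The level returned by `resolution_level` would be translated into an actual rendering resolution (for example a downscale factor) elsewhere, and `point_adding_budget` caps how many new Gaussians the growth step may create.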
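For the point-adding operation of claim 8, a simplified sketch is given below. It assumes isotropic Gaussians and chooses between cloning and splitting with a single scale threshold, which is a common heuristic in 3D Gaussian Splatting but only an assumption here; the claim bases the choice on the aggregate characteristics and image projection state of the target Gaussian. The `Gaussian` class, the shrink factor, and the other default values are hypothetical.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Gaussian:
    position: tuple   # (x, y, z) center
    scale: float      # isotropic extent (a simplification)
    opacity: float

def grow(target, rng, split_threshold=0.1, shrink=1.6, parts=2):
    """Return the new Gaussians produced from one target Gaussian."""
    if target.scale > split_threshold:
        # Split: sample child centers around the parent and shrink their scale.
        return [
            replace(
                target,
                position=tuple(c + rng.gauss(0.0, target.scale) for c in target.position),
                scale=target.scale / shrink,
            )
            for _ in range(parts)
        ]
    # Clone: copy the parameters to a nearby position.
    nearby = tuple(c + rng.uniform(-target.scale, target.scale) for c in target.position)
    return [replace(target, position=nearby)]

def densify(gaussians, target_indices, budget, rng=None):
    """Add new Gaussians for the targets without exceeding the budget."""
    rng = rng or random.Random(0)
    added = []
    for i in target_indices:
        children = grow(gaussians[i], rng)
        if len(added) + len(children) > budget:
            break
        added.extend(children)
    # Following the claim wording, new Gaussians are added to the existing set.
    return gaussians + added
```

Passing the target indices in descending score order means that, when the budget runs out, the Gaussians that are skipped are the lowest-ranked ones.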
Description
Gaussian acceleration method, device, and storage medium based on structure priors and directed growth
Technical Field
The application relates to the intersection of computer vision and computer graphics, and in particular to a Gaussian acceleration method, device, and storage medium based on structure priors and directed growth.
Background
Three-dimensional Gaussian splatting (3D Gaussian Splatting) has become an important method for novel view synthesis by representing scenes with an explicit set of Gaussian points and optimizing them through differentiable rendering. The technique typically initializes from a sparse point cloud obtained by structure from motion (SfM), then refines scene geometry and appearance through repeated point adding (densification) and backpropagation during training.
However, the process suffers from deep-seated efficiency constraints. First, the initial point cloud is sparse in weakly textured or distant regions, so the model struggles to cover the complete scene early in training and needs a large number of iterations to fill in. Second, the commonly used point-adding mechanism based on local gradient criteria cannot reliably distinguish regions that truly need detail reconstruction from high-gradient regions that are already saturated, which easily leads to wasted computation and model redundancy. Third, the training process lacks a global constraint on the growth of model complexity, so computational cost grows uncontrollably as the number of Gaussians and the rendering resolution increase, which directly limits the practicality and scalability of the method.
To address these efficiency bottlenecks, existing research has sought improvements from different angles, such as reducing redundant Gaussians by pruning, optimizing rendering with low-level compute libraries, or adjusting the training cadence with staged strategies. However, such improvements tend to focus on a single stage and fail to establish a jointly optimized framework. The three problems of insufficient initialization, undirected point adding, and uncontrolled complexity are intertwined and affect one another, so no local optimization alone can deliver an overall breakthrough in training efficiency. Further development of three-dimensional Gaussian splatting therefore urgently needs a comprehensive solution that treats the training process as a whole and uniformly schedules initialization quality, point-adding direction, and computation budget.
In this context, research in the field has focused on how to design an acceleration framework with a global view. Such a framework needs a high-quality structure prior to provide a good starting point for training, an intelligent mechanism to direct point-adding resources toward the regions that need refinement, and adaptive scheduling to strictly manage computational complexity throughout training, so that training efficiency and the practicality of the model are substantially improved while reconstruction accuracy is preserved.
The foregoing is provided merely to facilitate understanding of the technical solutions of the present application and is not an admission that the foregoing is prior art.
Disclosure of Invention
The main purpose of the application is to provide a Gaussian acceleration method, device, and storage medium based on structure priors and directed growth, aiming to solve the technical problems of low efficiency and low quality in existing 3D Gaussian training.
To achieve the above purpose, the application provides a Gaussian acceleration method based on structure priors and directed growth, the method comprising: acquiring a first point cloud and camera poses reconstructed by structure from motion (SfM), and a second point cloud and a depth prior produced by a feedforward structure estimation model; aligning the second point cloud to the first point cloud with a three-dimensional similarity transformation, and generating an initial Gaussian set by applying denoising and redundancy suppression to the alignment result, wherein a three-dimensional Gaussian model is constructed from the initial Gaussian set; calculating a composite score for each Gaussian in the initial Gaussian set based on the difference between the prior depth and the rendered depth of that Gaussian, its image projection features, and a preset confidence; sorting the Gaussians by composite score and selecting target Gaussians from the sorted result; acquiring a training image