CN-122023677-A - Building three-dimensional reconstruction method based on multi-source data alignment
Abstract
The invention relates to the field of building three-dimensional reconstruction, and in particular to a building three-dimensional reconstruction method based on multi-source data alignment, comprising the following steps: S1, obtaining building three-dimensional feature data captured by a plurality of unmanned aerial vehicles (UAVs), and performing multi-source collaborative alignment processing on the data to obtain aligned building three-dimensional feature data; S2, performing recursive spatial partitioning on the aligned building three-dimensional feature data with an octree algorithm to obtain a plurality of partition regions; S3, defining each partition region as a local center point by tile-based spatial decomposition and generating tile units; S4, extracting feature points with an SDF value of 0 from the global signed distance field using the marching cubes algorithm, constructing a continuous geometric surface of the building by linear interpolation, scaling the model to real scale using the distance reference provided by the UAVs' RTK-GNSS system, and outputting the reconstructed model. The method overcomes the limitation of traditional three-dimensional reconstruction models to a purely visual rendering effect.
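The surface extraction in step S4 places a mesh vertex wherever the signed distance field crosses zero along a cube edge, by linear interpolation between the SDF values at the two corners. A minimal sketch of that interpolation step (the function name and the sample values are illustrative, not from the patent):

```python
def zero_crossing_vertex(p0, p1, sdf0, sdf1):
    """Linearly interpolate the point where the SDF crosses zero along
    the edge from corner p0 to corner p1 (sdf0 and sdf1 must differ in sign)."""
    t = sdf0 / (sdf0 - sdf1)  # fraction of the way from p0 to p1
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))

# Example: SDF is -0.5 at one corner and +1.5 at the other, so the
# surface crosses a quarter of the way along the edge.
v = zero_crossing_vertex((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), -0.5, 1.5)
# v == (0.25, 0.0, 0.0)
```

The marching cubes algorithm applies this interpolation to every sign-changing edge of every cube in the grid and connects the resulting vertices by a per-cube triangle lookup table.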
Inventors
- LUO WEI
- LIAN MINGCHANG
- WANG YAOZONG
- FAN RENHAO
- ZHENG JIESHENG
- WANG WEI
- WANG SENLIN
- DAI LINGFENG
- CHEN HAO
- CHEN SONGHANG
- ZHANG JIANMING
Assignees
- Quanzhou Institute of Equipment Manufacturing (泉州装备制造研究所)
- Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences (中国科学院福建物质结构研究所)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-14
Claims (3)
- 1. A building three-dimensional reconstruction method based on multi-source data alignment, characterized by comprising the following steps: S1, acquiring building three-dimensional feature data captured by a plurality of unmanned aerial vehicles (UAVs), and performing multi-source collaborative alignment processing on the data to obtain aligned building three-dimensional feature data; S2, performing recursive spatial partitioning on the aligned building three-dimensional feature data with an octree algorithm to obtain a plurality of partition regions; S3, defining each partition region as a local center point by tile-based spatial decomposition and generating tile units, where the tile units are generated as follows: S3-1, a neural network is deployed for each tile unit to represent its local neural signed distance field; the local fields together form a global signed distance field. The network comprises a reflection multilayer perceptron (MLP) and a surface MLP. Tile units are classified according to the correlation between training rays and viewing direction, and the SDF values and color information in the current tile unit are fed into the reflection MLP and the surface MLP for learning: view-dependent tile units output view-dependent color and volume density through the reflection MLP, while view-independent tile units output view-independent color and volume density through the surface MLP. Adjacent tile units share the weights of the reflection MLP to eliminate tile-boundary artifacts; S3-2, the contribution of adjacent tile units is controlled by a weight function to ensure smooth switching at each tile boundary, as in formula (4): w_i(x) = exp(−d(x, ∂T_i)² / (2s²)) (4), where d(x, ∂T_i) is the distance from sampling point x to the tile boundary, s is the smoothing factor, and ∂T_i denotes the boundary of tile i. The color and density of the current sampling point are computed by weighted superposition of the predictions of each tile unit, with the core fusion formula (5): σ(x) = Σ_i w_i(x)·σ_i(x) / Σ_i w_i(x), c(x) = Σ_i w_i(x)·c_i(x) / Σ_i w_i(x) (5), where σ_i(x) is the volume density of the i-th tile and c_i(x) is the color of the i-th tile. Finally, the fused color and density are substituted into the rendering equation to render the three-dimensional reconstruction model, the rendering equation being formula (6): C(r) = ∫ T(t)·σ(r(t))·c(r(t), d) dt, with T(t) = exp(−∫₀ᵗ σ(r(s)) ds) (6), where the transmittance T(t) reflects the cumulative occlusion of all material on the path from the sampling point to the camera, r(t) represents the three-dimensional space coordinate as a function of t, and d represents the viewing direction; and S4, extracting feature points with an SDF value of 0 from the global signed distance field using the marching cubes algorithm, constructing a continuous geometric surface of the building by linear interpolation, scaling the model to real scale using the distance reference provided by the UAVs' RTK-GNSS system, and outputting the reconstructed model.
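The tile blending and rendering of step S3-2 in claim 1 can be sketched as follows. The Gaussian falloff used for the boundary weight and the per-sample alpha compositing are assumed standard forms standing in for the patent's formulas (4)-(6); all function names and parameter values are illustrative.

```python
import math

def tile_weight(dist_to_boundary, s=0.5):
    # Assumed smooth falloff with distance to the tile boundary
    # (stand-in for formula (4)); s is the smoothing factor.
    return math.exp(-(dist_to_boundary ** 2) / (2 * s ** 2))

def fuse(tiles):
    # tiles: list of (dist_to_boundary, density, (r, g, b)) predictions
    # from each tile overlapping the sampling point.
    # Weighted superposition of per-tile predictions (formula (5)).
    ws = [tile_weight(d) for d, _, _ in tiles]
    total = sum(ws)
    density = sum(w * sig for w, (_, sig, _) in zip(ws, tiles)) / total
    color = tuple(sum(w * col[k] for w, (_, _, col) in zip(ws, tiles)) / total
                  for k in range(3))
    return density, color

def render_ray(samples, dt=0.1):
    # Discrete volume rendering (formula (6)): alpha-composite the fused
    # (density, color) samples front to back along one camera ray.
    C = [0.0, 0.0, 0.0]
    T = 1.0  # accumulated transmittance toward the camera
    for density, color in samples:
        alpha = 1.0 - math.exp(-density * dt)
        for k in range(3):
            C[k] += T * alpha * color[k]
        T *= 1.0 - alpha
    return tuple(C)
```

For a point on the shared boundary of two tiles the weights are equal, so the fused density and color are plain averages of the two tile predictions, which is exactly the smooth-switching behavior the claim describes.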
- 2. The building three-dimensional reconstruction method based on multi-source data alignment according to claim 1, wherein the multi-source collaborative alignment processing of step S1 comprises the following steps: S1-1, converting the building image data captured by the plurality of UAVs into the same world coordinate system using pose information provided by the UAV-borne RTK-GNSS system, thereby establishing a coarse alignment between the multi-source data. Specifically, a pixel p in building image I_i and its corresponding three-dimensional space point P follow the projective linear model of formula (1): s·p = K·[R | t]·P (1), where the scale factor s represents the depth of the point in the UAV camera coordinate system, [R | t] is the camera extrinsic matrix obtained from the onboard RTK-GNSS system, and the camera intrinsic matrix K defines the geometric relationship between the camera's optical center and the imaging plane; S1-2, performing feature matching on the overlapping regions between images with a joint feature cost function combining visual features and semantic information, the joint feature cost function being formula (2): C(p₁, p₂) = ‖f(p₁) − f(p₂)‖² + λ·1[l(p₁) ≠ l(p₂)] (2), where p₁ and p₂ are pixels in the two images, f(·) is the pixel feature vector extracted by the feature extraction operator, l(·) is the pixel class label obtained from a semantic segmentation network, and λ is a weight coefficient adjusting the contribution ratio of texture features to semantic constraints; S1-3, adjusting the reprojection error with a distance-aware factor, the reprojection error E being formula (3): E = Σ_{i,j} ω(d_i)·ρ(‖p_{ij} − π(K, [R | t], P_j)‖) (3), where p_{ij} is the pixel position of the j-th feature match extracted in the i-th image, P_j is its position in the three-dimensional world coordinate system, ρ is the Huber loss function, and the projection operator π computes, through the UAV's intrinsic matrix K and extrinsic matrix [R | t], the theoretical pixel position of space point P_j in the i-th image; the distance-aware factor ω(d_i) = exp(−(d_i − d_min)/τ) adjusts dynamically with shooting distance, where d_i is the shooting distance of the i-th image, d_min is the minimum shooting distance, and τ is a probability distribution parameter; S1-4, establishing a photometric offset mapping between images under an implicit ambient-light mapping field; S1-5, evaluating observation quality through back-projection residuals to automatically clean dynamic occlusion and strong-reflection interference: the variance of each space point over all observation viewpoints is computed, and if the variance exceeds a threshold, the point is judged to be dynamic interference or a transient strong-reflection point and is removed from the optimization weights.
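The distance-weighted robust reprojection error of step S1-3 in claim 2 can be sketched as below. The exponential form of the distance-aware factor is an assumption (the patent translation only names its terms), and all function names and default parameters are illustrative.

```python
import math

def huber(r, delta=1.0):
    # Huber loss: quadratic near zero, linear for large residuals,
    # so mismatched features cannot dominate the optimization.
    return 0.5 * r * r if abs(r) <= delta else delta * (abs(r) - 0.5 * delta)

def distance_factor(d, d_min, tau=10.0):
    # Assumed exponential form of the distance-aware factor: images
    # shot farther from the building contribute less to the error.
    return math.exp(-(d - d_min) / tau)

def reprojection_error(matches, d_min):
    # matches: list of (pixel_residual_norm, shooting_distance) pairs,
    # where the residual is ||p_ij - pi(K, [R|t], P_j)|| per formula (3).
    return sum(distance_factor(d, d_min) * huber(r) for r, d in matches)
```

A residual of 0.5 px taken at the minimum shooting distance contributes 0.5·0.5²  = 0.125 to the total, while the same residual from twice as far away is scaled down by the distance factor.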
- 3. The building three-dimensional reconstruction method based on multi-source data alignment according to claim 2, wherein the partitioning of step S2 comprises the following steps: S2-1, computing the maximum and minimum values of the aligned building three-dimensional feature data in the X, Y and Z directions, and constructing from them the minimal axis-aligned bounding box containing all the data; S2-2, setting the recursion termination conditions, namely presetting a minimum tile data volume, a minimum tile size, and a maximum tile depth; S2-3, starting recursive spatial partitioning from the minimal axis-aligned bounding box, namely finding the center point of the current cube, dividing the current space into 8 sub-cubes by three mutually perpendicular planes, and distributing the feature data of the parent node among the 8 child nodes; and S2-4, updating the nodes: if a child node is empty, it is deleted directly; if a child node meets a termination condition, it is marked as a leaf node and the building data index of that region is stored; if a child node does not meet the termination conditions, it is taken as a new parent node and the process returns to step S2-3 to continue the recursion.
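The octree construction of steps S2-1 through S2-4 in claim 3 can be sketched as follows. Function names, the dict-based node layout, and the default termination thresholds are illustrative; for simplicity, points lying exactly on a splitting plane may be assigned to more than one child here.

```python
def aabb(points):
    # S2-1: minimal axis-aligned bounding box containing all the data.
    return (tuple(min(p[k] for p in points) for k in range(3)),
            tuple(max(p[k] for p in points) for k in range(3)))

def build_octree(points, bbox, max_depth=8, min_points=16, depth=0):
    lo, hi = bbox
    # S2-2: termination conditions (maximum depth, minimum data volume).
    if depth >= max_depth or len(points) <= min_points:
        return {"bbox": bbox, "points": points, "children": None}  # leaf node
    # S2-3: split at the center point into 8 sub-cubes.
    mid = tuple((a + b) / 2 for a, b in zip(lo, hi))
    children = []
    for octant in range(8):  # 3 bits select the low/high half per axis
        clo = tuple(lo[k] if not octant >> k & 1 else mid[k] for k in range(3))
        chi = tuple(mid[k] if not octant >> k & 1 else hi[k] for k in range(3))
        sub = [p for p in points
               if all(clo[k] <= p[k] <= chi[k] for k in range(3))]
        if sub:  # S2-4: empty children are deleted directly
            children.append(build_octree(sub, (clo, chi),
                                          max_depth, min_points, depth + 1))
    return {"bbox": bbox, "points": points, "children": children}
```

Two points in opposite corners of the unit cube, with `min_points=1`, produce exactly two non-empty children, each of which immediately becomes a leaf.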
Description
Building three-dimensional reconstruction method based on multi-source data alignment
Technical Field
The invention belongs to the field of building three-dimensional reconstruction, and in particular relates to a building three-dimensional reconstruction method based on multi-source data alignment.
Background
Building three-dimensional reconstruction is the process of converting physical-world building entities into computer-processable digital three-dimensional models using remote sensing, computer vision and graphics techniques. It obtains the geometric structure, surface texture and semantic information of a building from sensor data such as images, depth, or ground-survey data, and is often used for building scenes with large scale spans and complex optical characteristics, such as historic-building preservation and engineering monitoring. Traditional building three-dimensional reconstruction is based on multi-view geometry: dense matching is performed from known poses to generate a high-density point cloud, and the building's three-dimensional structure is extracted with methods such as Poisson reconstruction. Because traditional methods depend heavily on image feature-point matching, building parts from which feature points cannot be stably extracted, such as glass, mirrors, steel structures and solid-color curtain walls, often exhibit structural distortion or holes. Moreover, for slender structures, matching noise and smoothing algorithms tend to round and blur the edges, failing to recover sharp architectural contours.
In recent years, with the deep fusion of unmanned aerial vehicle (UAV) technology and computer vision, building three-dimensional reconstruction based on neural radiance fields (NeRF) has gradually replaced traditional schemes by virtue of its continuous representation and sub-pixel precision, providing a new path for building digital modeling. However, for high-precision multi-UAV three-dimensional reconstruction of buildings under complex illumination conditions, existing neural radiance field techniques have the following shortcomings: 1. Multi-source data fusion is difficult: in practice, differences in shooting time, illumination conditions and sensor parameters between UAVs cause severe photometric inconsistency in overlapping observation areas, which manifests as ghosting or artifacts in the reconstruction. 2. Large-scale building modeling is difficult: traditional neural radiance field reconstruction uses a single neural network to represent the implicit volume density field of the building model. For kilometer-scale urban building scenes, a single network struggles to fit the massive spatial detail, the training memory demand grows exponentially, convergence is extremely slow, and the requirements of rapid industrial delivery cannot be met. 3. The geometric representation lacks physical measurability: the three-dimensional reconstruction model generated by neural radiance field techniques is a continuous surface represented by an implicit volume density field, lacking explicit geometric boundaries while also being noisy.
As a result, the reconstructed three-dimensional model is difficult to convert into quantitative three-dimensional data with physical dimensions (area, volume, length, etc.) and cannot be applied directly to subsequent construction operations.
Disclosure of Invention
The invention aims to provide a building three-dimensional reconstruction method based on multi-source data alignment that improves mapping accuracy. To achieve the above purpose, the invention adopts the following technical scheme: a building three-dimensional reconstruction method based on multi-source data alignment, comprising the following steps: S1, acquiring building three-dimensional feature data captured by a plurality of UAVs, and performing multi-source collaborative alignment processing on the data to obtain aligned building three-dimensional feature data; S2, performing recursive spatial partitioning on the aligned building three-dimensional feature data with an octree algorithm to obtain a plurality of partition regions; S3, defining each partition region as a local center point by tile-based spatial decomposition and generating tile units, where the tile units are generated as follows