CN-122023490-A - Method for aligning visual angles of images on construction site

Abstract

The invention relates to the technical field of building information processing and provides a construction site image visual angle alignment method. The method constructs a multi-view rendering library covering typical inspection angles according to stable structural features, which avoids matching failures caused by missing view angles and addresses the instability of cross-domain alignment between BIM and construction site images. Coarse pose estimation is performed according to two-dimensional structural features and the multi-view rendering library, and the coarse pose is then optimized according to a cross-domain reprojection error objective function to obtain a refined pose. By combining occlusion-robust weights with a robust loss function, the method effectively suppresses interference from abnormal features caused by temporary occlusion and improves the accuracy of visual angle alignment.
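
The combination of occlusion weighting and a robust loss function described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the Huber loss stands in for the unspecified robust loss, and the weight values (0.0 for occluded features, 1.0 for stable structure regions, 0.3 otherwise) and all function names are hypothetical.

```python
from typing import Sequence


def huber(residual: float, delta: float = 1.0) -> float:
    """Huber loss (assumed robust loss): quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def occlusion_weight(occluded: bool, stable: bool,
                     w_occluded: float = 0.0,
                     w_stable: float = 1.0,
                     w_other: float = 0.3) -> float:
    """Hypothetical weighting: features inside the occlusion mask are rejected
    (weight 0), features in a visible stable structure region get a high weight."""
    if occluded:
        return w_occluded
    return w_stable if stable else w_other


def weighted_robust_cost(residuals: Sequence[float],
                         occluded: Sequence[bool],
                         stable: Sequence[bool],
                         delta: float = 1.0) -> float:
    """Weighted mean of robust losses over all feature residuals."""
    total = 0.0
    weight_sum = 0.0
    for r, occ, stab in zip(residuals, occluded, stable):
        w = occlusion_weight(occ, stab)
        total += w * huber(r, delta)
        weight_sum += w
    # If every feature is occluded there is nothing to align against.
    return total / weight_sum if weight_sum > 0 else float("inf")


if __name__ == "__main__":
    cost = weighted_robust_cost(
        residuals=[0.5, 10.0, 2.0],
        occluded=[False, True, False],
        stable=[True, False, False],
    )
    print(cost)
```

In this sketch the second residual (10.0) lies inside the occlusion mask, so it contributes nothing to the cost regardless of its size; this is the behavior the abstract attributes to the weighting.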

Inventors

  • ZHANG ERQING
  • WANG SHUNJIE

Assignees

  • 杭州浩联智能科技有限公司

Dates

Publication Date
20260512
Application Date
20260413

Claims (10)

  1. A construction site image visual angle alignment method, characterized by comprising the following steps: in response to an image visual angle alignment trigger signal for a target construction site, acquiring site image data of the target construction site and stable structural features of each type of BIM component in a corresponding BIM model, wherein the stable structural features are usable for registration; constructing a multi-view rendering library according to the stable structural features; processing the site image data to obtain two-dimensional structural features and a non-interpretation domain mask; performing coarse pose estimation according to the two-dimensional structural features and the multi-view rendering library to obtain a coarse pose; and reconstructing a cross-domain reprojection error objective function based on occlusion-robust weights, and optimizing the coarse pose according to the cross-domain reprojection error objective function to obtain a refined pose.
  2. The construction site image visual angle alignment method according to claim 1, wherein constructing the multi-view rendering library according to the stable structural features comprises: in a BIM coordinate system, configuring an inspection shooting range constraint according to the construction area of the target construction site and parameters of the inspection equipment; constraining the inspection shooting range to the construction area, and performing spatial discretization sampling based on the geometric topological relations of the construction area to obtain a plurality of candidate camera positions; generating a sampling region centered on each candidate camera position with a preset length as its radius; at each candidate camera position, directing the camera line of sight toward the geometric center of the construction area, and sampling a plurality of deflection angles within the sampling region corresponding to that candidate camera position to obtain a plurality of preliminary candidate camera poses; performing visibility screening on the preliminary candidate camera poses according to the stable structural features to obtain a plurality of intermediate candidate camera poses; calculating a score for each intermediate candidate camera pose according to preset dimensions, and taking the intermediate candidate camera poses whose scores exceed a preset score threshold as a plurality of candidate camera poses; rendering at each candidate camera pose to obtain a multi-modal prior for that candidate camera pose; and constructing the multi-view rendering library according to the multi-modal priors of the candidate camera poses.
  3. The construction site image visual angle alignment method according to claim 2, wherein constructing the multi-view rendering library according to the multi-modal priors of the candidate camera poses comprises: for each candidate camera pose, acquiring a rendered edge map from the multi-modal prior of the candidate camera pose, and extracting features from the rendered edge map to obtain a structural edge descriptor; acquiring a semantic instance mask from the multi-modal prior of the candidate camera pose, and extracting features from the semantic instance mask to obtain a semantic layout vector; concatenating the structural edge descriptor and the semantic layout vector to obtain a concatenated feature; normalizing the concatenated feature to obtain a first comprehensive feature vector of each candidate camera pose; obtaining a view number for each candidate camera pose; and storing the first comprehensive feature vector, the candidate camera pose, and the view number of each candidate camera pose in a vector database, and building an index using a vector retrieval structure to obtain the multi-view rendering library.
  4. The construction site image visual angle alignment method according to claim 2, wherein processing the site image data to obtain the two-dimensional structural features comprises: performing pixel-level semantic segmentation on the site image data to obtain a segmentation result of structurally salient components; and locating an extraction region of the site image data according to the segmentation result to serve as a stable structure region, and extracting, from the stable structure region, two-dimensional component features corresponding to the stable structural features to obtain the two-dimensional structural features.
  5. The construction site image visual angle alignment method according to claim 4, wherein performing coarse pose estimation according to the two-dimensional structural features and the multi-view rendering library comprises: generating a second comprehensive feature vector from the two-dimensional structural features; searching the multi-view rendering library using the second comprehensive feature vector to obtain a plurality of candidate BIM view angles and an image feature similarity corresponding to each candidate BIM view angle; converting the coarse pose information output by the inspection equipment from a sensor world coordinate system to the BIM coordinate system according to a pre-calibrated coordinate conversion relation to obtain a sensor coarse pose in the BIM coordinate system; acquiring the camera pose of each candidate BIM view angle; for each candidate BIM view angle, calculating the Euclidean distance between the position vectors of the camera pose and the sensor coarse pose to obtain a position difference, and calculating the angle between the optical axis directions of the camera pose and the sensor coarse pose to obtain an orientation difference; linearly combining the position difference and the orientation difference according to a first preset weight to obtain a sensor consistency distance; converting the sensor consistency distance into a sensor consistency score; normalizing the image feature similarity corresponding to each candidate BIM view angle to obtain an image similarity score for each candidate BIM view angle; performing weighted fusion of the sensor consistency score and the image similarity score of each candidate BIM view angle according to a second preset weight to obtain a comprehensive score for each candidate BIM view angle; and taking, from the plurality of candidate BIM view angles, the camera pose corresponding to the candidate BIM view angle with the highest comprehensive score as the coarse pose.
  6. The construction site image visual angle alignment method according to claim 5, wherein reconstructing the cross-domain reprojection error objective function based on the occlusion-robust weights comprises: within a preset deviation range of the coarse pose, establishing a correspondence between BIM three-dimensional semantic geometric features and image two-dimensional semantic features according to the stable structural features, the segmentation result of the structurally salient components, the two-dimensional structural features, and the non-interpretation domain mask; constructing the occlusion-robust weights, which comprise a non-interpretation domain weight, a stable structure region weight, and a robust loss function; calculating an edge distance error, a mask intersection-over-union (IoU) error, and a consistency error according to the correspondence, with the occlusion-robust weights as a constraint; acquiring a first error corresponding to the edge distance error, a second error corresponding to the mask IoU error, and a third error corresponding to the consistency error; and performing a weighted calculation on the edge distance error, the mask IoU error, and the consistency error according to the first error, the second error, and the third error to obtain the cross-domain reprojection error objective function; wherein features falling within the non-interpretation domain mask are rejected or given a low weight by the non-interpretation domain weight; visible stable structure regions are given a high weight by the stable structure region weight; the edge distance error represents the distance error between the BIM projected edges and the image semantic edges; the mask IoU error represents the IoU error between the projected BIM semantic mask and the image segmentation mask; and the consistency error represents the consistency error between the rendered depth and the depth estimated from the image.
  7. The construction site image visual angle alignment method according to claim 6, wherein optimizing the coarse pose according to the cross-domain reprojection error objective function to obtain the refined pose comprises: with the goal of minimizing the value of the cross-domain reprojection error objective function, performing gradient descent on the parameter vector corresponding to the coarse pose until convergence; and generating the refined pose according to the parameter vector obtained at convergence.
  8. The construction site image visual angle alignment method according to claim 7, wherein after the refined pose is obtained, the method further comprises: calculating a target confidence of the refined pose using the following formula: [formula missing from the source text]; wherein α represents an attenuation coefficient; the remaining terms, whose symbols are likewise missing from the source text, represent, respectively, the residual mean obtained from residual distribution statistics of the value of the cross-domain reprojection error objective function in the stable structure region, a variance penalty coefficient, and the residual variance obtained from the same residual distribution statistics; when the target confidence is smaller than a confidence threshold, self-correcting the refined pose to obtain a corrected pose; and generating an image visual angle alignment result for the target construction site according to the corrected pose.
  9. The construction site image visual angle alignment method according to claim 8, wherein self-correcting the refined pose to obtain the corrected pose comprises: when the site image data is a video, backtracking to adjacent frames of the site image data and reprocessing them to obtain the two-dimensional structural features and the non-interpretation domain mask; or, when the site image data is a single frame, taking, from the plurality of candidate BIM view angles, the camera pose corresponding to a candidate BIM view angle with a high comprehensive score as the coarse pose after backtracking; and repeating until the corresponding target confidence is detected to be greater than or equal to the confidence threshold or the maximum number of backtracking attempts is reached, and then determining the current refined pose as the corrected pose.
  10. The construction site image visual angle alignment method according to claim 9, wherein generating the image visual angle alignment result for the target construction site according to the corrected pose comprises: re-rendering the BIM model based on the corrected pose to obtain a visual-angle-aligned rendering consistent with the view angle of the site image data; generating an alignment availability flag according to the corrected pose; and combining the corrected pose, the visual-angle-aligned rendering, the detected target confidence, and the alignment availability flag to obtain the visual angle alignment result.
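
The candidate scoring described in claim 5 can be sketched as follows. This is an illustrative reading only: the claim does not say how the sensor consistency distance is converted into a score or how similarities are normalized, so the `exp(-distance)` mapping, the min-max normalization, the weight values, and all names (`Pose`, `select_coarse_pose`, and so on) are assumptions.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Pose:
    position: Vec3       # camera centre in the BIM coordinate system
    optical_axis: Vec3   # viewing direction (need not be unit length)


def angle_between(u: Vec3, v: Vec3) -> float:
    """Angle in radians between two direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def sensor_consistency_score(cand: Pose, sensor: Pose,
                             w_pos: float = 1.0, w_ang: float = 1.0) -> float:
    """Linear combination of position and orientation differences, mapped to (0, 1]."""
    pos_diff = math.dist(cand.position, sensor.position)
    ang_diff = angle_between(cand.optical_axis, sensor.optical_axis)
    distance = w_pos * pos_diff + w_ang * ang_diff   # "first preset weight"
    return math.exp(-distance)                       # assumed distance-to-score mapping


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Assumed normalization of the retrieval similarities to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_coarse_pose(candidates: Sequence[Pose],
                       similarities: Sequence[float],
                       sensor: Pose,
                       w_sensor: float = 0.5,
                       w_image: float = 0.5) -> Tuple[int, float]:
    """Return the index and comprehensive score of the best candidate BIM view."""
    image_scores = min_max_normalize(similarities)
    best_idx, best_score = -1, float("-inf")
    for i, (cand, img) in enumerate(zip(candidates, image_scores)):
        # "Second preset weight": fuse sensor consistency and image similarity.
        score = w_sensor * sensor_consistency_score(cand, sensor) + w_image * img
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score


if __name__ == "__main__":
    sensor = Pose((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
    views = [
        Pose((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
        Pose((10.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
        Pose((1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    ]
    print(select_coarse_pose(views, [0.8, 0.9, 0.1], sensor))
```

In the usage example, the second view has the highest image similarity but lies far from the sensor's reported pose, so the fused score prefers the first view. The fusion serves this purpose on sites with repeated structures such as column grids, where image similarity alone can be ambiguous.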

Description

Method for aligning visual angles of images on construction site

Technical Field

The invention relates to the technical field of building information processing, and in particular to a construction site image visual angle alignment method.

Background

In intelligent construction inspection, a common practice is to compare photos or video shot at a construction site with a BIM (Building Information Modeling) model to determine whether a component is missing, misplaced, or improperly constructed. For the comparison to be reliable, the real site image and the BIM rendering must share the same view angle; that is, the precise pose of the site camera relative to the BIM coordinate system (position, orientation, and intrinsic parameters) must be solved.

Currently adopted visual angle alignment schemes mainly include the following:

(1) Manual point selection or manual registration. Inspectors manually select feature points in the site image and register them with the corresponding points in BIM or CAD (Computer-Aided Design). This approach is inefficient, highly subjective, and error-prone, and it is not suitable for large-scale automated inspection.

(2) General visual SLAM (Simultaneous Localization and Mapping) or three-dimensional reconstruction. Site video is processed with SLAM or SfM (Structure from Motion) to obtain a camera trajectory, which is then roughly aligned with the BIM. This approach relies on texture and stable illumination. Construction sites suffer from severe occlusion, dust, reflections, and repetitive structures (such as column grids and beam grids), so the approach easily drifts or fails.

(3) Image retrieval or purely geometric ICP (Iterative Closest Point). An approximate view angle is retrieved from a pre-rendered BIM view library and then locally optimized. However, BIM renderings and real site images are cross-domain data (geometric rendering versus real pixels), so relying directly on pixel similarity or point cloud ICP often produces mismatches.

(4) No reliability assessment of the alignment result. Most methods output a pose that is used directly for subsequent difference detection. They lack a mechanism for automatically recognizing and correcting alignment failures caused by occlusion and temporary objects (such as scaffolding, formwork, machinery, and people), which leads to large numbers of false alarms.

In summary, the main problems of existing visual angle alignment schemes are:

(1) Unstable cross-domain alignment: the appearance of the BIM differs greatly from that of the image, and general-purpose features are difficult to match stably.

(2) Sensitivity to occlusion: occlusion by temporary objects or by the construction process biases the pose solution and is difficult to identify automatically.

(3) No confidence estimate or self-correcting closed loop: an alignment failure does not trigger repositioning, which undermines the reliability of subsequent violation detection.

Disclosure of Invention

In view of the above, it is necessary to provide a construction site image visual angle alignment method that addresses the low accuracy of visual angle alignment at construction sites with complex environmental conditions such as occlusion, illumination changes, and structural repetition.
A construction site image visual angle alignment method, comprising: in response to an image visual angle alignment trigger signal for a target construction site, acquiring site image data of the target construction site and stable structural features of each type of BIM component in a corresponding BIM model, wherein the stable structural features are usable for registration; constructing a multi-view rendering library according to the stable structural features; processing the site image data to obtain two-dimensional structural features and a non-interpretation domain mask; performing coarse pose estimation according to the two-dimensional structural features and the multi-view rendering library to obtain a coarse pose; and reconstructing a cross-domain reprojection error objective function based on occlusion-robust weights, and optimizing the coarse pose according to the cross-domain reprojection error objective function to obtain a refined pose.

A construction site image visual angle alignment apparatus, comprising: an acquisition unit, configured to, in response to the image visual angle alignment trigger signal for the target construction site, acquire site image data of the target construction site and stable structural features of each type of BIM component in the corresponding BIM model, wherein the stable structural f