CN-121982714-A - Automatic labeling method for outer surface of three-dimensional model based on image segmentation mapping
Abstract
The invention belongs to the technical field of computer vision and large-scale three-dimensional geometric data labeling, and relates to an automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping. The method comprises the following steps: S1, importing a triangular mesh model to be labeled, arranging N virtual cameras, and outputting N two-dimensional color images and depth maps; S2, inputting the two-dimensional color images into a pre-trained high-performance semantic segmentation network, which outputs a semantic label and a quantized segmentation confidence for each two-dimensional pixel; S3, projecting the center point of each three-dimensional mesh patch to two-dimensional pixel coordinates using the camera parameters and performing a depth-consistency check; S4, collecting all valid labels and confidences of each patch observed from multiple viewing angles, and determining its final label and comprehensive confidence; S5, automatically locking high-confidence regions and marking low-confidence or conflicting regions with a special to-be-reviewed label. The method significantly improves the efficiency, accuracy, and consistency of semantic annotation of large-scale three-dimensional models.
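The label fusion of steps S4 and S5 can be sketched as below. This is a minimal illustration only: the quadratic weight, the averaging rule for the comprehensive confidence, the threshold value, and the name `REVIEW_LABEL` are all assumptions, since the publication does not disclose the exact nonlinear weighting function or formulas.

```python
from collections import defaultdict

REVIEW_LABEL = "TO_BE_CHECKED"  # hypothetical name for the special to-be-reviewed label


def fuse_labels(observations, weight=lambda c: c ** 2, threshold=0.8):
    """Fuse (label, confidence) pairs from the effective views of one patch.

    `observations` holds the (label, confidence) tuples that passed the
    depth-consistency check of step S3.  The weighting function and the
    averaging below are illustrative assumptions.
    """
    if not observations:
        return REVIEW_LABEL, 0.0
    # confidence-weighted voting with a nonlinear weight (assumed quadratic here)
    votes = defaultdict(float)
    for label, conf in observations:
        votes[label] += weight(conf)
    final = max(votes, key=votes.get)
    # comprehensive confidence: mean confidence of the views agreeing with the winner
    agreeing = [c for lbl, c in observations if lbl == final]
    overall = sum(agreeing) / len(agreeing)
    if overall < threshold:
        return REVIEW_LABEL, overall  # step S5: flag low-confidence patches for review
    return final, overall
```

For example, three views voting `("window", 0.9)`, `("window", 0.95)`, `("wall", 0.4)` yield the final label `window` with comprehensive confidence 0.925, which exceeds the assumed threshold and is locked automatically.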
Inventors
- LI ZIXIANG
- LIAO NIANDONG
Assignees
- Changsha University of Science and Technology (长沙理工大学)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-06
Claims (6)
- 1. An automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping, characterized by comprising the following steps: step 1, importing a triangular mesh model to be labeled, obtaining the vertex coordinates and topology information of the triangular mesh model, arranging N virtual cameras at a certain distance from the outer surface of the three-dimensional model such that every visible outer surface of the model is observed by at least k cameras, setting the camera parameters, performing projection rendering of the three-dimensional model, and outputting N two-dimensional color images and depth maps; step 2, inputting the two-dimensional color images output in step 1 into a pre-trained high-performance semantic segmentation network, and outputting a semantic label and a quantized segmentation confidence for each two-dimensional pixel; step 3, for any mesh patch of the three-dimensional model, determining its geometric center point and, using the known camera parameters, projecting that center point onto the two-dimensional image coordinates of each camera; computing the depth of the projection point in the camera coordinate system and comparing it with the value recorded at that pixel in the camera's depth map; if the two depths agree, the patch is deemed effectively observed by that camera, and the semantic label and confidence at that pixel are recorded; if the difference is large, the patch is determined to be occluded by other geometry or located on the back side, and the label from that viewing angle is not recorded; step 4, for each patch observed from multiple viewing angles, collecting the labels and confidences provided by all effective viewing angles, performing conflict resolution by a confidence-weighted voting mechanism, and determining the final label and its comprehensive confidence; step 5, when the comprehensive confidence of a patch is higher than a preset threshold, permanently assigning the final label to the patch to complete the automatic labeling, and marking any patch whose comprehensive confidence is lower than the preset threshold with a special to-be-reviewed label.
- 2. The automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping according to claim 1, wherein setting the camera parameters in step 1 specifically comprises precisely setting, for each virtual camera, the intrinsic parameters, namely the focal length and the principal point, and the extrinsic parameters, namely the rotation matrix and the translation vector.
- 3. The automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping according to claim 2, wherein the rendering in step 1 utilizes a standard graphics rendering pipeline.
- 4. The automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping according to claim 3, wherein the high-performance semantic segmentation network in step 2 is PSPNet, DeepLabV3+, or a Vision Transformer-based segmentation model.
- 5. The automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping according to claim 4, wherein the conflict resolution by the confidence-weighted voting mechanism in step 4 is specifically implemented by performing weighted voting with a nonlinear weighting function, from which the final label is determined.
- 6. The automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping according to claim 4, wherein the comprehensive confidence in step 4 is determined by aggregating the confidences contributed by the effective viewing angles of the patch, the set of effective viewing angles being the set of viewing angles verified by the geometric constraints of step 3.
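The visibility test of step 3 (claims 1 to 3) can be sketched as follows, assuming the pinhole camera model of claim 2 with intrinsics K, rotation R, and translation t. The function names and the tolerance `tol` are illustrative assumptions, not values from the publication.

```python
import numpy as np


def project_point(K, R, t, p_world):
    """Project a world-space point into pixel coordinates; also return its
    depth in the camera coordinate system."""
    p_cam = R @ p_world + t             # world frame -> camera frame
    depth = p_cam[2]
    uv = (K @ (p_cam / depth))[:2]      # perspective division, then intrinsics
    return uv, depth


def observed_effectively(K, R, t, p_world, depth_map, tol=0.05):
    """Depth-consistency check of step 3: the patch center counts as an
    effective observation only if its projected depth matches the depth map
    rendered from this camera, otherwise it is occluded or back-facing."""
    uv, depth = project_point(K, R, t, p_world)
    u, v = int(round(uv[0])), int(round(uv[1]))
    h, w = depth_map.shape
    if depth <= 0 or not (0 <= u < w and 0 <= v < h):
        return False                    # behind the camera or outside the image
    return abs(depth - depth_map[v, u]) < tol
```

With an identity pose and a patch center on the optical axis at depth 2, the point projects to the principal point; a depth map storing 2.0 there passes the check, while a depth map storing 1.0 (a closer occluder) rejects the view.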
Description
Automatic labeling method for outer surface of three-dimensional model based on image segmentation mapping
Technical Field
The invention belongs to the technical field of computer vision and large-scale three-dimensional geometric data labeling, and relates to an automatic labeling method for the outer surface of a three-dimensional model based on image segmentation mapping.
Background
In leading-edge applications such as modern city construction, intelligent planning, virtual reality (VR), augmented reality (AR), and high-precision map construction for automated driving, three-dimensional geometric models with rich semantic information have become core data assets. Such semantic information, including but not limited to the structure types of a building's exterior surface such as windows, doors, walls, balconies, roofs, and moldings, is the basis for model analysis, query, and interaction. At present, the methods for semantic annotation of a three-dimensional mesh (Mesh) structure fall mainly into the following three categories, each with limitations that are difficult to overcome.
(1) Purely manual annotation. This is the most traditional and basic method. Annotators use specialized computer-aided design (CAD) or three-dimensional modeling software (e.g., Blender, Autodesk Maya, 3ds Max, SketchUp) to select vertices, edges, or patch elements on the three-dimensional model surface, one by one or in batches, using the tools the software supplies. The selected geometric elements are then manually assigned the corresponding semantic labels. For a complex building model containing hundreds of thousands or even millions of patches, the repetitiveness of the manual operations is extremely high, and the workload grows geometrically.
Labeling a building model of moderate complexity may take weeks or even months, resulting in extremely long data production cycles that severely lag project requirements. Annotating high-precision three-dimensional models is skilled work, the required labor cost is extremely high, and large-scale data integrators can hardly bear it. Moreover, the labeling result is highly susceptible to the annotators' subjective judgment, fatigue, and differing understanding of the labeling specification, so the consistency of label boundaries and semantic assignments is difficult to guarantee.
(2) Segmentation of point cloud or laser radar (Lidar) data. This class of methods is aimed mainly at raw three-dimensional point clouds acquired through Lidar or photogrammetry. Point cloud processing algorithms (e.g., RANSAC, region growing, and deep learning methods such as the PointNet series) cluster and geometrically segment the point cloud data. Point cloud segmentation is based primarily on geometric features (e.g., normals, curvatures, flatness). For regions with similar geometric characteristics but completely different semantics, such as a smooth white wall and a smooth white gate, such algorithms have difficulty distinguishing the high-level semantics. Even when the point cloud is correctly segmented, accurately transferring the point cloud labels to the Mesh model after meshing and topology optimization still requires a complex point-to-surface matching and interpolation process, which easily introduces new errors. Raw Lidar point cloud data also typically contain large amounts of noise, sparse regions, or voids caused by occlusion, which directly degrade the reliability of segmentation results based on geometric features.
(3) Multi-view reconstruction with projected labels. Several studies have attempted to acquire photographs of a three-dimensional object from multiple angles and reconstruct the three-dimensional model by structure from motion (SfM) or multi-view stereo (MVS), and then use the two-dimensional image features to assist in labeling the three-dimensional model. However, these methods rely on real-world image acquisition, involve huge data volumes, and are easily affected by illumination, weather, and camera calibration errors. A fine registration error always exists between the images and the reconstructed three-dimensional model, producing misalignment during label mapping. The SfM and MVS processes are computationally intensive, and together with the subsequent feature extraction and mapping the whole pipeline consumes huge computing resources, making fast, low-cost labeling difficult to realize.
In summary, the prior art faces a core contradiction: how to greatly improve the labeling efficiency of three-dimensional models while guaranteeing high-precision semantic annotation, and simultaneously reduce labor and computation costs.
Disclosu