
CN-121999477-A - Identifying and matching assets in 3D data

CN121999477A

Abstract

The invention relates to identifying and matching assets in 3D data. A computer-implemented method for automatically identifying a representation of an asset in a 3D model comprises segmenting the 3D model into a plurality of cluster elements, at least a subset of which comprises a representation of the asset; organizing each cluster element into an asset structure comprising a plurality of nodes and edges; and computing an asset embedding for each node and edge, each asset embedding representing the asset or a portion thereof. The method further comprises, in particular in real time: obtaining from a user a search template that corresponds to a user-selected asset or a representation of that asset; generating a template structure comprising a plurality of nodes and edges and a template embedding for each node and edge, each template embedding representing the user-selected asset or a portion thereof; comparing the template embeddings with the asset embeddings of the plurality of asset structures to find a representation of the asset in the model that is similar to the representation of the user-selected asset; and providing information about the found representation of the asset to the user.
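The abstract describes a two-phase pipeline: an offline phase that segments the model and precomputes asset embeddings, and an online phase that embeds a user template and compares it against the index. The following minimal Python sketch illustrates that split only; the `segment` and `embed` functions and the distance threshold are toy placeholders of my own (the patent's embeddings are computed per node and edge of an asset structure, typically by a trained network):

```python
import math

def segment(model):
    # Placeholder segmentation: treat the whole model as one cluster element.
    # The patent segments the 3D model into many cluster elements.
    return [model]

def embed(points):
    # Toy stand-in for a learned embedding: centroid plus per-axis extent.
    dims = list(zip(*points))
    centroid = [sum(d) / len(d) for d in dims]
    extent = [max(d) - min(d) for d in dims]
    return centroid + extent

def preprocess(model):
    # Offline phase: segment the 3D model and embed each cluster element.
    return [(cluster, embed(cluster)) for cluster in segment(model)]

def search(asset_index, template_points, threshold=1.0):
    # Online phase: embed the user template and return clusters whose
    # embeddings lie within an (assumed) distance threshold.
    t = embed(template_points)
    return [cluster for cluster, e in asset_index
            if math.dist(e, t) < threshold]
```

Because the index is built once, only the cheap template embedding and comparison run at query time, which is what makes the real-time behaviour of the claimed method plausible.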

Inventors

  • Martin Winistoerfer
  • A. Prado de Torreblanca
  • L. Lopez Fernandez

Assignees

  • Hexagon Technology Center

Dates

Publication Date
2026-05-08
Application Date
2025-11-03
Priority Date
2024-11-05

Claims (15)

  1. A computer-implemented method (100) for automatically identifying a representation (20) of an asset in a 3D model (1) of an environment, the method comprising the steps of: - segmenting (110) the 3D model (1) into a plurality of cluster elements, at least a subset of the cluster elements comprising a representation (20) of the asset; - organizing (120) each cluster element into an asset structure (25) comprising a plurality of nodes (26, 27) and edges; and - calculating (130) asset embeddings (28) for each of the nodes (26, 27) and edges, each asset embedding representing the asset or a portion thereof; wherein the method further comprises the steps of: - obtaining (150) a search template from a user, the search template corresponding to a user-selected asset or a representation (10) of the user-selected asset; - generating (160) a template structure (15) based on the obtained search template, the template structure (15) comprising a plurality of nodes (16, 17) and edges and template embeddings (18) for each of the nodes (16, 17) and edges, each template embedding (18) representing the user-selected asset or a part thereof; - comparing (170) the template embeddings (18) with the asset embeddings (28) of a plurality of asset structures (25) to find a representation (20) of an asset in the model (1) that is similar to the representation (10) of the user-selected asset; and - providing (190) the user with information about the representation (20) of the found asset.
  2. The method (100) according to claim 1, wherein the asset structure (25) and the template structure (15) are a graph structure or a tree structure, in particular an octree, in particular wherein the template structure (15) is generated (160) such that the template embeddings (18) have the same format as the asset embeddings (28).
  3. The method (100) of claim 1 or claim 2, wherein the 3D model (1) is a point cloud or mesh, in particular wherein the environment comprises a plurality of assets of a plurality of different asset types.
  4. The method (100) of any one of the preceding claims, wherein comparing (170) the template embeddings (18) with the asset embeddings (28) is performed by a trained neural network.
  5. The method (100) according to any one of the preceding claims, the method comprising preprocessing (105) the 3D model (1), the preprocessing (105) being completed before obtaining (150) the search template from the user, wherein at least the segmentation (110) of the 3D model and the organization (120) of the cluster elements are part of the preprocessing (105), in particular wherein the computation (130) of the embeddings for each of the nodes and edges is also part of the preprocessing (105).
  6. The method (100) of any one of the preceding claims, wherein at least the steps of generating (160) the template structure (15), comparing (170) the template embeddings (18) with the asset embeddings (28), and providing (190) information to the user are performed in real time.
  7. The method (100) according to any one of the preceding claims, the method comprising displaying (140) the 3D model (1) to a user, in particular wherein the displayed 3D model is a preprocessed 3D model.
  8. The method (100) of claim 7, wherein obtaining (150) the search template from the user includes enabling the user to select a representation (20) of the asset in the displayed 3D model (1).
  9. The method (100) of claim 7 or claim 8, wherein providing (190) the user with information about the representation (20) of the found asset comprises displaying the representation (20) of the found asset in the 3D model (1) in a highlighted manner.
  10. The method (100) according to any one of the preceding claims, wherein the 3D model (1) is a point cloud and the obtained search template is or comprises a parameterized model, in particular a BIM or CAD model, and generating (160) the template structure (15) comprises: - generating a synthetic point cloud from the parameterized model, in particular using data augmentation, or - comparing one or more existing similar parameterized models of a particular asset and using existing point clouds of that asset.
  11. The method (100) of any one of the preceding claims, wherein the obtained search template comprises at least one of text, video and image related to the user-selected asset, and generating (160) the template structure (15) comprises identifying the user-selected asset from the text, video or image, in particular wherein the 3D model (1) is a point cloud, and generating (160) the template structure (15) further comprises using an existing point cloud of the identified user-selected asset.
  12. The method (100) according to any one of the preceding claims, the method comprising filtering (180) outliers in the representation (20) of the found asset before providing (190) the information to the user, in particular wherein the method comprises enabling the user to provide feedback on the provided representation (20) of the found asset, wherein the feedback is used to improve the filtering (180), in particular wherein the feedback comprises examples of selected true positives and false positives.
  13. The method (100) according to any one of the preceding claims, wherein organizing (120) the cluster elements into asset structures (25) is performed by a trained neural network, in particular wherein - generating (160) the template structure (15) is performed by the same trained neural network, and/or - the trained neural network is a graph neural network that has been trained using symbolic rules to learn relational constraints.
  14. The method (100) of any one of the preceding claims, wherein organizing (120) the cluster elements into an asset structure (25) comprises: - identifying and grouping assets using visual foundation models on images; - enforcing multi-view consistency; - building a structure of all assets based on neighborhood relations; - refining the structure to a desired granularity; - creating an embedding vector for each element, and - assigning the embedding vector to the structure, in particular wherein the asset structure (25) is a hierarchical structure.
  15. A computer program product comprising program code with computer-executable instructions for performing the method (100) according to any one of claims 1 to 14.
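The comparing step (170) of claim 1 matches template embeddings against the asset embeddings of each structure's nodes. The claims leave the comparison open (claim 4 performs it with a trained neural network); the sketch below uses a simple cosine-similarity scoring rule of my own as a stand-in, scoring each asset structure by the mean best-match similarity of the template's node embeddings:

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def match_structures(template_embeddings, asset_structures, min_score=0.9):
    # Hypothetical scoring rule: for each asset structure, take each template
    # node embedding's best match among the structure's node embeddings and
    # average those similarities; report structures above a threshold.
    hits = []
    for asset_id, asset_embeddings in asset_structures.items():
        score = sum(max(cosine(t, a) for a in asset_embeddings)
                    for t in template_embeddings) / len(template_embeddings)
        if score >= min_score:
            hits.append((asset_id, score))
    return sorted(hits, key=lambda h: -h[1])
```

In the claimed method this comparison runs only over precomputed asset embeddings, which is what keeps the online query cheap enough for real-time use.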

Description

Identifying and matching assets in 3D data

Technical Field

The present invention relates to a method of automatically identifying a representation of an asset in a three-dimensional dataset, such as a point cloud.

Background

It is known to capture 3D data of the surrounding environment as a point cloud, for example using a laser scanner or similar capture device. The captured 3D data may then be visualized to a user on a display. Sometimes it is necessary to find a certain asset in the 3D data; for example, a user may want to find a particular machine or a particular type of door in the point cloud data. Depending on the size of the 3D data and the number of assets in the data, this can be a cumbersome and time-consuming task. It is therefore desirable to provide a method that allows a user to easily initiate an automatic search for assets in 3D data.

One solution is to run object detection algorithms in 3D for a particular type of asset. To differentiate between assets, it would be necessary to train the algorithms on many different types of assets. Disadvantageously, the assets must be specified in advance, i.e. prior to training and deploying the algorithm. This results in a very large number of categories, thereby increasing the necessary capacity of the neural network. Furthermore, little or no asset data is available for training; adding a large number of categories to the limited amount of data for certain categories creates a data imbalance during training. As a result, the algorithm tends to classify unseen objects into the most common categories used during training, or fails to detect objects of less common categories.

Alternatively, rather than finding the asset directly in 3D space, 2D object detection is used to locate the asset in a corresponding image and, once identified, the asset is projected into 3D to highlight it. Disadvantageously, this solution works only when an image and an accurate projection between the image and the point cloud are available in the data. Other drawbacks include multi-view aggregation, i.e. the handling of multiple detections of the same object, and poor performance in distinguishing between the foreground and the background of the point cloud (i.e. between the points of the object and the points behind the object from the perspective of the image).

US 2023/0028242 A1 discloses generating a unique identification code for an industrial commodity, and US 11,704,343 B2 discloses an artificial-intelligence-based method for correlating data about physical-world assets from multiple databases.

Point cloud registration (i.e. aligning different scans of a single, geometrically identical point cloud) is a major application of 3D feature descriptors. For example, K. Fu et al., "Robust Point Cloud Registration Framework Based on Deep Graph Matching", 9 November 2022 (arXiv:2211.04696v1), disclose a method for 3D point cloud registration. Zeng et al., "3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions", 9 April 2017 (arXiv:1603.08182v3), disclose a method for matching local geometric features on real-world depth images. While this approach follows a 2D paradigm, i.e. using feature descriptors and matching algorithms to find correspondences between features in 3D space, these features do not represent objects or assets: there is no association between a 3D feature and an actual asset type (e.g. an asset from a directory or library). An asset may be considered as a whole or as a summation of its sub-portions. This involves extending the rationale of 2D matching into 3D space and providing a solution on how to handle and compare similar assets or objects.
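The descriptor-based matching paradigm referred to above pairs features from two sets by nearest-neighbor search over their descriptors. A minimal illustrative sketch is given below; the ratio test and the toy descriptors are my own illustrative assumptions (registration pipelines such as 3DMatch would supply learned descriptors as the inputs), and, as the text notes, such matches relate features, not asset types:

```python
import math

def match_descriptors(desc_a, desc_b, ratio=0.8):
    # Nearest-neighbor descriptor matching with a Lowe-style ratio test:
    # a match is accepted only when the best candidate in desc_b is clearly
    # closer than the runner-up. desc_b must contain at least two descriptors.
    matches = []
    for i, d in enumerate(desc_a):
        dists = sorted((math.dist(d, b), j) for j, b in enumerate(desc_b))
        (best_dist, best_j), (second_dist, _) = dists[0], dists[1]
        if best_dist < ratio * second_dist:
            matches.append((i, best_j))
    return matches
```

The output is a list of index pairs linking geometrically similar features; turning such feature correspondences into asset-level matches is precisely the gap the invention addresses.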
For example, this task of finding assets in 3D space based on a template may be accomplished by creating a 3D template, computing the embedding of the template, sampling the point cloud with the same window size, computing a list of 3D embeddings, and comparing the list to the template embedding to identify potential matches. Although this method is analogous to 2D template matching, in 3D space the sampling of the point cloud is, disadvantageously, computationally very expensive. Moreover, real-time computation is not possible and the user needs to wait several minutes to get a result. Furthermore, compared to images, it is difficult to segment edges and fine details in a point cloud.

Disclosure of the Invention

It is therefore an object of the present invention to provide an improved method for identifying assets in a point cloud based on user input. It is a particular object to provide such a method that allows reliable and real-time identification of assets. The claimed invention relates to a computer-implemented method for automatically identifying a representation of an asset in a 3D model of an environment. The 3D model may be, for example, a point cloud or mesh, and the environment may include a plurality of assets of a plurality of different asset types. The method comprises the fol