CN-122024222-A - Self-adaptive neighborhood selection point cloud self-supervision classification and segmentation method, device and medium


Abstract

The invention belongs to the technical field of three-dimensional point cloud self-supervised representation learning and three-dimensional computer vision, and discloses a self-supervised point cloud classification and segmentation method with adaptive neighborhood selection, together with a device and a medium. The method comprises the following steps: S1, data acquisition, namely downloading point cloud sample data from a public point cloud data source, storing it in a storage device, and reading it from the storage device during training; S2, construction of structural relations; S3, adaptive neighborhood scale selection; S4, construction of a fused representation; S5, model training; S6, model testing and application. The invention strengthens the modeling of local point cloud structure under a mask reconstruction framework, adapts to non-uniform point density and varying geometric complexity, improves the generalization and robustness of point cloud representations, and provides a new technical path and application prospects for intelligent point cloud perception and three-dimensional scene understanding.

Inventors

ZHU SHOUZHENG
YANG LIU
ZHANG YANGYANG
LIU XU
YANG WENHANG
DUAN JINYI
WANG CEYUAN
LI CHUNLAI
CHEN YUWEI

Assignees

Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences (国科大杭州高等研究院)

Dates

Publication Date: 20260512
Application Date: 20260407

Claims (8)

1. A self-supervised point cloud classification and segmentation method with adaptive neighborhood selection, characterized by comprising the following steps: S1, data acquisition, namely acquiring point cloud sample data from a public point cloud data source and/or actual scanning equipment and storing the point cloud sample data in a storage device; S2, structural relationship construction, namely performing a K-nearest-neighbor search on each point before masking to establish an adjacency graph, and extracting edge features describing the geometric relationships among points to obtain point-level structural features; S3, adaptive neighborhood scale selection, namely constructing candidate neighborhoods at a plurality of neighbor scales for each point, extracting multi-scale structural features, outputting scale scores through a gating selection network, and determining a target scale through sparse activation to obtain adaptive point-level structural features; S4, fused representation construction, namely grouping the point cloud into point cloud blocks, extracting block-level semantic features, aligning the point-level structural features according to the grouping and aggregating them within each block to obtain block-level structural features, concatenating the point-level structural features and the block-level structural features, and generating the fused representation through a mapping network; S5, model training, namely applying a mask to the fused representation to obtain visible blocks and masked blocks, inputting the visible blocks, the masked blocks and position embeddings into a decoder to reconstruct the point coordinates of the masked point cloud blocks, and optimizing the model parameters with a reconstruction loss to obtain a pre-trained model; and S6, model testing and application, namely transferring the pre-trained model to a downstream classification or segmentation task, outputting classification or segmentation labels on a test set, calculating evaluation metrics, iteratively adjusting the mask ratio, the neighborhood scale set, and the network and training hyperparameters according to the metrics, and displaying, storing or using the results in three-dimensional perception applications.
2. The self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to claim 1, wherein S2 comprises the following steps: S2.1, adjacency graph construction, namely performing a K-nearest-neighbor search on each point p_i in the point cloud based on three-dimensional coordinates to obtain a neighbor set N(i), and establishing edges between the point p_i and each of its neighbor points p_j to form a point-level adjacency graph; S2.2, edge feature extraction, namely constructing an edge feature vector for each edge (p_i, p_j), wherein the edge feature vector at least takes the form [p_i, p_j - p_i], which includes the relative displacement, and is embedded by a mapping network; S2.3, structural feature generation, namely aggregating the edge embeddings connected to the point p_i to obtain the point-level structural feature of the point p_i, wherein the point-level structural features represent the geometric relations and local topological dependencies among points.
3. The self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to claim 1, wherein S3 comprises: S3.1, multi-scale candidate neighborhood construction, namely constructing candidate neighborhood scales with at least two different neighbor numbers for each point, and constructing the neighbor set of each corresponding scale; S3.2, setting a scale structure expert network for each neighborhood scale to extract the structural features at that scale; and S3.3, gating selection and sparse activation, namely constructing a gating selection network that outputs a score for each scale expert, and adopting a Top-1 sparse activation strategy to select the scale expert with the highest score as the target scale expert, which outputs the adaptive point-level structural features.
4. The self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to claim 3, wherein the neighborhood scales are a three-scale configuration corresponding to a small scale, a medium scale and a large scale, respectively.
5. The self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to claim 1, wherein S4 comprises the following steps: S4.1, semantic block generation, namely selecting a set of center points from the point cloud by farthest point sampling, and performing a K-nearest-neighbor search around each center point to form a point cloud block, thereby generating an index matrix that maps each center point to the points in its block; S4.2, semantic feature extraction, namely inputting each point cloud block into a semantic feature extraction network to obtain block-level semantic features; S4.3, structural feature aggregation, namely reusing the index matrix to gather, from the point-level structural features, the set of intra-block structural features corresponding to each center point, and aggregating that set to obtain block-level structural features; and S4.4, fusion of semantic and structural features, namely concatenating the block-level semantic features and the block-level structural features along the channel dimension, and generating the fused token representation through a convolutional mapping.
6. The self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to claim 1, wherein S5 comprises: S5.1, the encoder adopts a Transformer structure, receives only the visible tokens and their position embeddings during pre-training, and extracts global context features by means of a self-attention mechanism; S5.2, the decoder adopts a lightweight Transformer structure, receives the visible tokens, the mask tokens and their position embeddings, and outputs the coordinate reconstruction result of the masked point cloud blocks; S5.3, the reconstruction loss adopts an L2-norm-based Chamfer distance, which measures the bidirectional distance between the reconstructed point set and the ground-truth point set, and the model parameters are updated according to the reconstruction loss.
7. A device for self-supervised point cloud classification and segmentation with adaptive neighborhood selection, comprising one or more processors configured to implement the self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to any one of claims 1-6.
8. A computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the self-supervised point cloud classification and segmentation method with adaptive neighborhood selection according to any one of claims 1-6.
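
The structural relation construction of claim 2 and the Top-1 scale selection of claim 3 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names (knn_indices, edge_features, expert, top1_gate, adaptive_feature), the max-pooling "expert" and the hand-written toy gate are assumptions introduced for illustration, whereas the claims rely on learned mapping, expert and gating networks.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]

def edge_features(points, i, neighbours):
    """Edge feature [p_i, p_j - p_i] for every neighbour j of point i."""
    p_i = points[i]
    return [list(p_i) + [b - a for a, b in zip(p_i, points[j])]
            for j in neighbours]

def expert(edges):
    """Toy scale expert (assumption): max-pool the relative displacements."""
    rel = [e[3:] for e in edges]           # keep only the p_j - p_i half
    return [max(col) for col in zip(*rel)]

def top1_gate(scores):
    """Top-1 sparse activation: index of the highest gating score."""
    return max(range(len(scores)), key=lambda s: scores[s])

def adaptive_feature(points, i, scales, gate_fn):
    """Extract one feature per candidate scale, then keep the top-scored one."""
    per_scale = [expert(edge_features(points, i, knn_indices(points, i, k)))
                 for k in scales]
    chosen = top1_gate(gate_fn(per_scale))
    return scales[chosen], per_scale[chosen]

# Usage with a hypothetical gate that scores a scale by the sum of its feature.
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
toy_gate = lambda feats: [sum(f) for f in feats]
print(adaptive_feature(points, 0, [1, 2], toy_gate))  # (2, [1, 2, 0])
```

The brute-force neighbour search is quadratic in the number of points and is only meant for small examples; a spatial index would normally replace it.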

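The reconstruction loss of claim 6 (S5.3) can likewise be sketched in plain Python. The sketch assumes the common convention in which the L2 variant of the Chamfer distance averages squared Euclidean nearest-neighbour distances in both directions; the claims do not state whether the distance is squared or how the two directions are weighted, and the function name chamfer_l2 is hypothetical.

```python
def chamfer_l2(pred, target):
    """Symmetric Chamfer distance between two point sets (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def one_way(src, dst):
        # Mean over src of the squared distance to the closest point in dst.
        return sum(min(sq_dist(s, d) for d in dst) for s in src) / len(src)

    # Bidirectional term: reconstruction -> ground truth and back.
    return one_way(pred, target) + one_way(target, pred)

# Usage: a perfect reconstruction scores 0, a unit offset scores 1 + 1.
print(chamfer_l2([(0, 0, 0)], [(0, 0, 0)]))  # 0.0
print(chamfer_l2([(0, 0, 0)], [(1, 0, 0)]))  # 2.0
```

In pre-training, a loss of this kind would be evaluated between each reconstructed masked block and its ground-truth points, and its gradient would drive the parameter update.
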
Description

Self-adaptive neighborhood selection point cloud self-supervision classification and segmentation method, device and medium

Technical Field

The invention relates to the technical field of three-dimensional point cloud self-supervised representation learning and three-dimensional computer vision, and in particular to a self-supervised point cloud classification and segmentation method with adaptive neighborhood selection, together with a device and a medium.

Background

In the field of three-dimensional computer vision, point cloud classification and point cloud segmentation are important foundations for three-dimensional scene understanding, and are widely applied in scenarios such as autonomous driving, robot navigation and three-dimensional reconstruction. However, point cloud data is unordered and irregular, and often suffers from non-uniform point density, missing data caused by occlusion, and noise interference, which limits the robustness and generalization of point cloud understanding models in complex environments. In the prior art, point cloud understanding models mostly rely on supervised learning, which requires a large amount of annotated data; because three-dimensional point cloud annotation is costly, it is difficult to cover the diversity of complex scenes. To reduce the dependence on annotation, mask-reconstruction self-supervised pre-training frameworks are used for point cloud representation learning; they learn general-purpose representations by masking part of the input and reconstructing the missing part.
However, existing mask reconstruction methods still suffer from insufficient modeling of local structure, damage to structural continuity caused by masking, fixed neighborhood scales that adapt poorly to non-uniform density and varying geometric complexity, and the redundancy and conflicts introduced by running multiple scales in parallel. A method is therefore needed that strengthens structural modeling and selects the neighborhood scale adaptively under a mask reconstruction framework.

Disclosure of Invention

The invention aims to provide a self-supervised point cloud classification and segmentation method with adaptive neighborhood selection, together with a device and a medium, so as to solve the prior-art problems of insufficient modeling of point cloud structural dependencies, damage to structural continuity caused by masking, and poor adaptability of fixed neighborhood scales.

To achieve the above purpose, the invention provides the following technical solution: a self-supervised point cloud classification and segmentation method with adaptive neighborhood selection, comprising the following steps:

S1, data acquisition, namely acquiring point cloud sample data from a public point cloud data source and/or actual scanning equipment and storing the point cloud sample data in a storage device;

S2, structural relationship construction, namely performing a K-nearest-neighbor search on each point before masking to establish an adjacency graph, and extracting edge features describing the geometric relationships among points to obtain point-level structural features;

S3, adaptive neighborhood scale selection, namely constructing candidate neighborhoods at a plurality of neighbor scales for each point, extracting multi-scale structural features, outputting scale scores through a gating selection network, and determining a target scale through sparse activation to obtain adaptive point-level structural features;

S4, fused representation construction, namely grouping the point cloud into point cloud blocks, extracting block-level semantic features, aligning the point-level structural features according to the grouping and aggregating them within each block to obtain block-level structural features, concatenating the point-level structural features and the block-level structural features, and generating the fused representation through a mapping network;

S5, model training, namely applying a mask to the fused representation to obtain visible blocks and masked blocks, inputting the visible blocks, the masked blocks and position embeddings into a decoder to reconstruct the point coordinates of the masked point cloud blocks, and optimizing the model parameters with a reconstruction loss to obtain a pre-trained model;

S6, model testing and application, namely transferring the pre-trained model to a downstream classification or segmentation task, outputting classification or segmentation labels on a test set, calculating evaluation metrics, iteratively adjusting the mask ratio, the neighborhood scale set, and the network and training hyperparameters according to the metrics, and displaying, storing or using the results in three-dimensional perception applications.

Further, step S2 comprises the following steps: S2.1, adjacency graph construction, namely performing a K-nearest-neighbor search on each point p_i in the point cloud based on three-dimensional coordinates to obtain a neig