
CN-116128985-B - Large-scale point cloud geometric compression method based on double-branch neural network

CN116128985B

Abstract

The invention discloses a large-scale point cloud geometric compression method based on a dual-branch neural network. The method represents a point cloud as a node sequence, samples the node sequence with a dense feature window and a sparse feature window, and feeds both sampling results into a Transformer-based dual-branch neural network comprising a dense context branch and a sparse context branch, effectively extracting both sparse large-scale context and local detail context. Because the dense and sparse feature sampling results complement each other, the method avoids compressing every point in the cloud and reduces space occupation. The sparse context features and dense context features are fused to determine each node's occupancy code and its probability distribution, and an entropy model encodes the occupancy codes under those distributions, finally achieving efficient compression of large-scale point clouds. The invention reduces the space occupied by large-scale point cloud compression, has high efficiency, and can be widely applied in the field of point cloud compression.

Inventors

  • ZUO DIAN
  • YU PENGPENG
  • LIANG FAN
  • SUN WEI

Assignees

  • Sun Yat-sen University (中山大学)

Dates

Publication Date
2026-05-05
Application Date
2023-02-16

Claims (8)

  1. A large-scale point cloud geometric compression method based on a dual-branch neural network, characterized by comprising the following steps: expressing the input point cloud in octree form, and representing the octree as a node sequence through an ancestor node aggregation module; sampling the node sequence with a dense feature window to obtain a dense feature sampling result; sampling the node sequence with a sparse feature window to obtain a sparse feature sampling result, wherein the window length of the dense feature window is smaller than that of the sparse feature window; inputting the dense feature sampling result into the dense context branch of a Transformer-based dual-branch neural network to obtain dense context features; inputting the sparse feature sampling result into the sparse context branch of the Transformer-based dual-branch neural network to obtain sparse context features; fusing the dense context features and the sparse context features through a feature mixing module to obtain fused features; determining, through the dual-branch neural network and according to the fused features, the occupancy code of each node in the dense and sparse feature sampling results and the probability distribution of that occupancy code; and inputting the occupancy codes and their probability distributions into an entropy model to obtain the code stream output after entropy encoding;
wherein the step of sampling the node sequence with a dense feature window to obtain a dense feature sampling result comprises: acquiring dense nodes from the node sequence by sliding the dense feature window; concatenating the occupancy code, level, and octant number of each dense node to form a dense main feature vector; obtaining an embedding for each dense node by applying a learnable embedding matrix to its dense main feature vector; and aggregating the embeddings of each dense node's ancestors to construct a dense fused embedding matrix, which serves as the dense feature sampling result;
and wherein the step of sampling the node sequence with a sparse feature window to obtain a sparse feature sampling result comprises: acquiring sparse nodes from the node sequence by sliding the sparse feature window; concatenating the level and octant number of each sparse node to form a sparse main feature vector; obtaining an embedding for each sparse node by applying a learnable embedding matrix to its sparse main feature vector; and aggregating the embeddings of each sparse node's ancestors to construct a sparse fused embedding matrix, which serves as the sparse feature sampling result.
  2. The large-scale point cloud geometric compression method based on a dual-branch neural network according to claim 1, wherein the step of inputting the dense feature sampling result into the dense context branch of the Transformer-based dual-branch neural network to obtain dense context features comprises: inputting the dense feature sampling result into a dense encoder network in the Transformer-based dual-branch neural network to obtain the dense context features.
  3. The large-scale point cloud geometric compression method based on a dual-branch neural network according to claim 1, wherein the step of inputting the sparse feature sampling result into the sparse context branch of the Transformer-based dual-branch neural network to obtain sparse context features comprises: inputting the sparse feature sampling result into the encoder of a sparse Transformer in the Transformer-based dual-branch neural network to obtain the sparse context features.
  4. The large-scale point cloud geometric compression method based on a dual-branch neural network of claim 1, further comprising: for each node in the current frame, searching for its y nearest neighbor nodes in the previous frame using the KNN algorithm; and concatenating the embedding of each node with those of its corresponding neighbor nodes to obtain an inter-frame node embedding.
  5. The large-scale point cloud geometric compression method based on a dual-branch neural network according to any one of claims 1-4, wherein the dense context branch and the sparse context branch each comprise two Transformers and one MLP layer, and the feature mixing module comprises one MLP layer and three Transformers.
  6. A large-scale point cloud geometric compression device based on a dual-branch neural network, wherein the device is configured to implement the method according to claim 1, and the device comprises: a node sequence acquisition unit, used for expressing the input point cloud in octree form and representing the octree as a node sequence through the ancestor node aggregation module; a feature sampling unit, used for sampling the node sequence with a dense feature window to obtain a dense feature sampling result, and for sampling the node sequence with a sparse feature window to obtain a sparse feature sampling result; a context feature extraction unit, used for inputting the dense feature sampling result into the dense context branch of the Transformer-based dual-branch neural network to obtain dense context features, and for inputting the sparse feature sampling result into the sparse context branch to obtain sparse context features; a context feature fusion unit, used for fusing the dense context features and the sparse context features through the feature mixing module to obtain fused features; a probability distribution determining unit, used for determining, through the dual-branch neural network and according to the fused features, the occupancy code of each node in the dense and sparse feature sampling results and the probability distribution of that occupancy code; and a compression coding unit, used for inputting the occupancy codes and their probability distributions into the entropy model to obtain the code stream output after entropy encoding.
  7. An electronic device, comprising a processor and a memory, wherein the memory is used for storing a program, and the processor executes the program to implement the method of any one of claims 1 to 5.
  8. A computer-readable storage medium, characterized in that the storage medium stores a program that, when executed by a processor, implements the method of any one of claims 1 to 5.
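The dual-window sampling recited in claim 1 can be illustrated with a minimal sketch. Everything here is an illustrative assumption (the toy 3-dimensional per-node features, the particular window lengths, the stride, and the `window_sample` helper); the only property taken from the claims is that the dense window is shorter than the sparse window, so dense samples capture local detail while sparse samples cover a wider context.

```python
import numpy as np

def window_sample(node_features: np.ndarray, window: int, stride: int) -> np.ndarray:
    """Slide a window of length `window` over the node sequence and
    stack each window's per-node feature vectors into one sample."""
    samples = []
    for start in range(0, len(node_features) - window + 1, stride):
        samples.append(node_features[start:start + window])
    return np.stack(samples)

# Toy node sequence: 12 octree nodes, each with a 3-dim feature
# (standing in for occupancy code, octree level, octant number).
nodes = np.arange(36, dtype=np.float32).reshape(12, 3)

dense_samples = window_sample(nodes, window=4, stride=4)   # short window: local detail
sparse_samples = window_sample(nodes, window=8, stride=4)  # longer window: wide context

print(dense_samples.shape)   # (3, 4, 3)
print(sparse_samples.shape)  # (2, 8, 3)
```

In a full pipeline, each windowed sample would then be embedded (via the learnable embedding matrix of claim 1) before entering its branch of the network.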

Description

Large-scale point cloud geometric compression method based on double-branch neural network

Technical Field

The invention relates to the technical field of point cloud compression, in particular to a large-scale point cloud geometric compression method based on a double-branch neural network.

Background

Point clouds, a flexible 3D data representation, are typically composed of the coordinates of points in three-dimensional space together with related attributes (e.g., color and reflectivity). Point clouds are widely used in many fields such as autonomous driving, virtual reality, and robotics. However, point clouds require extremely large storage space, which makes storage and transmission difficult, so efficient point cloud compression methods are in high demand. Some deep-learning-based methods employ sparse 3D convolution to better model the dependencies between already-coded voxels and the voxels to be coded. Existing CNN-based models extract spatial context information better than hand-crafted models. However, due to their limited receptive fields, they gather geometric context over a relatively narrow range, which can lead to poor performance on large point clouds. Therefore, existing large-scale point cloud compression techniques suffer from drawbacks such as low efficiency and large space occupation.

Disclosure of Invention

In view of this, an embodiment of the invention provides a large-scale point cloud geometric compression method based on a dual-branch neural network, which reduces space occupation and is efficient.
An aspect of the embodiments of the present invention provides a large-scale point cloud geometric compression method based on a dual-branch neural network, comprising: expressing the input point cloud in octree form, and representing the octree as a node sequence through an ancestor node aggregation module; sampling the node sequence with a dense feature window to obtain a dense feature sampling result; sampling the node sequence with a sparse feature window to obtain a sparse feature sampling result, wherein the window length of the dense feature window is smaller than that of the sparse feature window; inputting the dense feature sampling result into the dense context branch of a Transformer-based dual-branch neural network to obtain dense context features; inputting the sparse feature sampling result into the sparse context branch of the Transformer-based dual-branch neural network to obtain sparse context features; fusing the dense context features and the sparse context features through a feature mixing module to obtain fused features; determining, through the dual-branch neural network and according to the fused features, the occupancy code of each node in the dense and sparse feature sampling results and the probability distribution of that occupancy code; and inputting the occupancy codes and their probability distributions into an entropy model to obtain the code stream output after entropy encoding.
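As a rough sketch of the fusion and probability-prediction steps above: random weights and a single ReLU projection stand in for the trained Transformer branches and the feature mixing module, and the softmax head over the 256 possible 8-bit child-occupancy codes is an assumption about the output parameterization, not the patented architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w, b):
    # Single ReLU layer standing in for the feature mixing module's MLP.
    return np.maximum(x @ w + b, 0.0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

n_nodes, d = 5, 16
dense_ctx = rng.normal(size=(n_nodes, d))   # stand-in for dense context features
sparse_ctx = rng.normal(size=(n_nodes, d))  # stand-in for sparse context features

# Feature mixing: concatenate both contexts, project, then predict a
# per-node distribution over the 256 possible 8-bit occupancy codes.
w_mix = rng.normal(size=(2 * d, d)); b_mix = np.zeros(d)
w_out = rng.normal(size=(d, 256));   b_out = np.zeros(256)

fused = mlp(np.concatenate([dense_ctx, sparse_ctx], axis=-1), w_mix, b_mix)
probs = softmax(fused @ w_out + b_out)

print(probs.shape)  # (5, 256): one occupancy-code distribution per node
```

These per-node distributions are exactly what the entropy model consumes in the final step of the method.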
Preferably, sampling the node sequence with a dense feature window to obtain a dense feature sampling result includes: acquiring dense nodes from the node sequence by sliding the dense feature window; concatenating the occupancy code, level, and octant number of each dense node to form a dense main feature vector; obtaining an embedding for each dense node by applying a learnable embedding matrix to its dense main feature vector; and aggregating the embeddings of each dense node's ancestors to form a dense fused embedding matrix, which serves as the dense feature sampling result. Preferably, inputting the dense feature sampling result into the dense context branch of the Transformer-based dual-branch neural network to obtain dense context features includes: inputting the dense feature sampling result into a dense encoder network in the Transformer-based dual-branch neural network to obtain the dense context features. Preferably, sampling the node sequence with a sparse feature window to obtain a sparse feature sampling result includes: acquiring sparse nodes from the node sequence by sliding the sparse feature window; concatenating the level and octant number of each sparse node to form a sparse main feature vector; obtaining an embedding for each sparse node by applying a learnable embedding matrix to its sparse main feature vector; and aggregating the embeddings of each sparse node's ancestors to form a sparse fused embedding matrix, which serves as the sparse feature sampling result. Preferably, inputting the sparse feature sampling result into the sparse context branch of the Transformer-based dual-branch neural network to obtain sparse context features includes: inputting the sparse feature sampling result into the encoder of a sparse Transformer in the Transformer-based dual-branch neural network to obtain the sparse context features.
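To see why accurate occupancy-code distributions shrink the code stream: an ideal entropy coder spends about -log2 p(code) bits per node, so a confident, correct prediction costs almost nothing while a uniform distribution costs the full 8 bits of the raw occupancy byte. The `ideal_bitcost` helper and the toy distributions below are illustrative assumptions, not the patent's entropy model.

```python
import numpy as np

def ideal_bitcost(occupancy_codes: np.ndarray, probs: np.ndarray) -> np.ndarray:
    """Bits an ideal entropy coder would spend encoding each node's
    8-bit occupancy code under the model's predicted distribution."""
    p = probs[np.arange(len(occupancy_codes)), occupancy_codes]
    return -np.log2(p)

# Two nodes: one with a confident model, one with a uniform model.
probs = np.full((2, 256), 1.0 / 256)
probs[0] = 0.0
probs[0, 37] = 1.0                        # model nearly certain node 0's code is 37
probs[0] = 0.99 * probs[0] + 0.01 / 256   # smooth to avoid log(0)

codes = np.array([37, 200])
bits = ideal_bitcost(codes, probs)
print(bits)  # node 0 costs well under 1 bit; node 1 costs the full 8 bits
```

Summed over all nodes, this is the code-stream size the method reduces by predicting sharper distributions from the fused dense and sparse contexts.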