CN-121999065-A - Dynamic point cloud geometric compression method based on space-time feature enhancement and related equipment
Abstract
The invention discloses a dynamic point cloud geometric compression method based on space-time feature enhancement and related equipment. The method comprises: performing multi-layer downsampling on a reference point cloud and an original point cloud to obtain reference compressed point clouds and original compressed point clouds for layers 1 to L; performing entropy encoding and decoding on the layer-L original compressed point cloud to obtain a layer-L reconstructed point cloud; setting l = L; upsampling the layer-l reconstructed point cloud to obtain a layer-(l-1) expanded point cloud, and performing geometric encoding and decoding with the layer-(l-1) original compressed point cloud to obtain a layer-(l-1) coarse reconstructed point cloud; performing temporal modeling on the layer-(l-1) coarse reconstructed point cloud and the layer-(l-1) reference compressed point cloud to obtain a layer-(l-1) fused alignment feature; decrementing l and repeating the reconstruction until l = 1; and obtaining a final reconstructed point cloud from the layer-1 reconstructed point cloud. The method meets the compression requirements of complex point clouds and can be widely applied in the technical field of point cloud data processing.
Inventors
- LIANG FAN
- LIU XIN
- LIU XIANGZUO
Assignees
- 中山大学 (Sun Yat-sen University)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-19
Claims (10)
- 1. A dynamic point cloud geometric compression method based on space-time feature enhancement, characterized by comprising the following steps: acquiring point cloud data, wherein the point cloud data comprises a reference point cloud and an original point cloud; performing first multi-layer downsampling on the reference point cloud and the original point cloud to correspondingly obtain a layer-1 reference compressed point cloud and a layer-1 original compressed point cloud; performing second multi-layer downsampling on the layer-1 reference compressed point cloud and the layer-1 original compressed point cloud to correspondingly obtain a reference compressed point cloud set and an original compressed point cloud set, wherein the reference compressed point cloud set comprises the layer-2 to layer-L reference compressed point clouds, and the original compressed point cloud set comprises the layer-2 to layer-L original compressed point clouds; performing first entropy encoding and decoding on the layer-L original compressed point cloud to obtain a layer-L reconstructed point cloud; initializing l to L; performing one-time upsampling on the layer-l reconstructed point cloud to obtain a layer-(l-1) expanded point cloud, and performing geometric encoding and decoding based on the layer-(l-1) expanded point cloud and the layer-(l-1) original compressed point cloud to obtain a layer-(l-1) coarse reconstructed point cloud; performing temporal modeling by using an offset-guided interpolation technique based on the layer-(l-1) coarse reconstructed point cloud and the layer-(l-1) reference compressed point cloud to obtain a layer-(l-1) fused alignment feature; performing second entropy encoding and decoding based on residual generation on the layer-(l-1) fused alignment feature and the layer-(l-1) original compressed point cloud to obtain the layer-(l-1) reconstructed point cloud; decrementing l and, if l is not equal to 1, returning to the step of performing one-time upsampling on the layer-l reconstructed point cloud, until l is equal to 1; and obtaining a final reconstructed point cloud from the layer-1 reconstructed point cloud through first multi-layer upsampling processing.
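The coarse-to-fine control flow of claim 1 can be sketched as follows. This is a minimal illustration, not the invention's implementation: `downsample` stands in for the lossy compression modules, and the entropy, geometric, and residual codecs are replaced by identity placeholders so only the layer bookkeeping and the l = L → 1 loop remain. All function names here are hypothetical.

```python
import numpy as np

def downsample(points, factor=2):
    """Voxel downsampling: quantize coordinates and deduplicate (simplified stand-in)."""
    return np.unique(points // factor, axis=0)

def multiscale_codec(original, reference, L=3):
    """Sketch of the coarse-to-fine loop from claim 1 (placeholder codecs).

    Layer 1 is the first downsampled scale; layers 2..L come from further
    downsampling. Decoding proceeds from layer L back down to layer 1.
    """
    # Build the layer pyramids (layer index 1..L stored at list index 0..L-1).
    orig_layers, ref_layers = [downsample(original)], [downsample(reference)]
    for _ in range(2, L + 1):
        orig_layers.append(downsample(orig_layers[-1]))
        ref_layers.append(downsample(ref_layers[-1]))

    # Layer L: first entropy encode/decode (identity here as a placeholder).
    recon = orig_layers[L - 1]

    # Loop l = L, L-1, ..., 2: upsample, coarse-reconstruct, temporally align.
    l = L
    while l != 1:
        expanded = np.repeat(recon, 2, axis=0)   # one-time upsampling of layer l
        coarse = orig_layers[l - 2]              # geometric codec placeholder
        fused = (coarse, ref_layers[l - 2])      # temporal alignment placeholder
        recon = coarse                           # residual codec placeholder
        l -= 1
    return recon
```

With the placeholder codecs being lossless pass-throughs, the loop simply returns the layer-1 pyramid entry; in the actual method each step would introduce and then correct coding loss.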
- 2. The method of claim 1, wherein performing the first multi-layer downsampling on the reference point cloud and the original point cloud to correspondingly obtain the layer-1 reference compressed point cloud and the layer-1 original compressed point cloud comprises the following steps: processing the reference point cloud through a plurality of lossy compression modules to obtain the layer-1 reference compressed point cloud; and processing the original point cloud through a plurality of lossy compression modules to obtain the layer-1 original compressed point cloud; wherein each lossy compression module is configured with a first downsampling for the reference point cloud and a second downsampling for the original point cloud, and the sampling factors of the first downsampling and the second downsampling are the same.
- 3. The method according to claim 1, wherein performing the first entropy encoding and decoding on the layer-L original compressed point cloud to obtain the layer-L reconstructed point cloud comprises the following steps: performing rounding quantization on the layer-L original compressed point cloud to obtain first quantized data; performing first entropy encoding on the first quantized data to obtain a first code stream, wherein the first code stream is used for storage and transmission; and performing first entropy decoding on the first code stream to obtain the layer-L reconstructed point cloud; wherein a first entropy model is configured for the first entropy encoding and the first entropy decoding and is used for modeling the feature distribution, and when the layer-L original compressed point cloud is not a single feature point, the first entropy model is also used for modeling the coordinate distribution.
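The rounding quantization of claim 3 can be illustrated with a toy sketch. The learned first entropy model is not specified in the claim, so an empirical Shannon-entropy estimate stands in for it here purely to show where the bitrate comes from; both function names are hypothetical.

```python
import numpy as np

def quantize(features):
    """Rounding quantization from claim 3: the encoder rounds the features,
    producing the first quantized data that the entropy coder consumes."""
    return np.rint(features).astype(np.int64)

def estimate_bits(symbols):
    """Toy stand-in for the first entropy model: empirical Shannon entropy of
    the rounded symbols (a learned conditional model would replace this)."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum() * symbols.size)
```

For example, `quantize(np.array([0.2, 1.7, 1.9]))` yields `[0, 2, 2]`, whose empirical entropy bounds the cost of the first code stream for this toy model.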
- 4. The method according to claim 1, wherein performing one-time upsampling on the layer-l reconstructed point cloud to obtain the layer-(l-1) expanded point cloud, and performing geometric encoding and decoding based on the layer-(l-1) expanded point cloud and the layer-(l-1) original compressed point cloud to obtain the layer-(l-1) coarse reconstructed point cloud, comprises the following steps: performing one-time generative multiple upsampling on the layer-l reconstructed point cloud to obtain the layer-(l-1) expanded point cloud, and generating a binary mask based on the coordinate sets of the layer-(l-1) expanded point cloud and the layer-(l-1) original compressed point cloud, wherein the binary mask is used for representing the geometric information of the layer-(l-1) original compressed point cloud; performing occupancy probability prediction with a preset neural network based on the layer-(l-1) expanded point cloud to obtain a predicted distribution; dividing the binary mask into a preset number of groups of sub-masks according to the spatial positions of different voxels; taking the first group of sub-masks as the target mask; performing second entropy encoding on the target mask based on the predicted distribution to obtain a second code stream, wherein the second code stream is used for storage and transmission; performing second entropy decoding on the second code stream to obtain a geometric mask; combining the geometric mask with the predicted distribution to update the predicted distribution, and taking the next group of sub-masks as the target mask; returning to the step of performing second entropy encoding on the target mask based on the predicted distribution until all sub-masks have been traversed, and obtaining a complete geometric mask according to the finally updated predicted distribution; and removing non-occupied points through a pruning technique based on the complete geometric mask to obtain the layer-(l-1) coarse reconstructed point cloud.
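The group-by-group coding loop of claim 4 can be sketched as follows. This is an assumption-laden illustration: real arithmetic coding under the predicted distribution is replaced by a lossless pass-through, and the spatial grouping of voxels is simplified to an index-parity split; the point of the sketch is the progressive pattern in which each decoded group refines the prediction used for the next.

```python
import numpy as np

def grouped_mask_coding(mask, pred, n_groups=2):
    """Sketch of claim 4's grouped coding: the binary occupancy mask is split
    into groups by spatial position; each group is (notionally) entropy coded
    under the current predicted distribution, and the decoded bits then update
    the prediction for later groups. Arithmetic coding is omitted."""
    decoded = np.full(mask.shape, -1, dtype=np.int8)   # -1 = not yet decoded
    groups = np.arange(mask.size) % n_groups           # simplified spatial grouping
    for g in range(n_groups):
        idx = np.where(groups == g)[0]
        # entropy encode/decode group g under `pred` (placeholder: transmit exactly)
        decoded[idx] = mask[idx]
        # fold decoded occupancy back into the prediction for the next group
        pred = np.where(decoded >= 0, decoded.astype(float), pred)
    return decoded

def prune(points, full_mask):
    """Claim 4's pruning: drop voxels the complete geometric mask marks empty."""
    return points[full_mask.astype(bool)]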
- 5. The method according to claim 1, wherein performing the temporal modeling by using the offset-guided interpolation technique based on the layer-(l-1) coarse reconstructed point cloud and the layer-(l-1) reference compressed point cloud to obtain the layer-(l-1) fused alignment feature comprises the following steps: taking the layer-(l-1) coarse reconstructed point cloud as the current point cloud frame, and taking the layer-(l-1) reference compressed point cloud as the reference point cloud frame; predicting, with a preset neural network based on the current point cloud frame and the reference point cloud frame, a point-wise offset from the current point cloud frame to the reference point cloud frame; applying the point-wise offset to the current point cloud frame to obtain back-mapped coordinates; performing interpolation sampling in the feature space corresponding to the reference point cloud frame based on the back-mapped coordinates to obtain reference features; and aggregating the reference features with the original features corresponding to the current point cloud frame to obtain the layer-(l-1) fused alignment feature.
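The offset-guided interpolation of claim 5 can be sketched with explicit arrays. This is a simplified stand-in: the `offsets` argument would come from the preset prediction network, nearest-neighbour lookup replaces the learned interpolation sampling, and plain averaging replaces the learned aggregation; all names are hypothetical.

```python
import numpy as np

def offset_guided_align(cur_xyz, ref_xyz, ref_feat, offsets):
    """Sketch of claim 5: apply the predicted per-point offsets to the current
    frame's coordinates, then sample the reference frame's feature space at the
    back-mapped coordinates (nearest-neighbour sampling stands in for the
    learned interpolation)."""
    mapped = cur_xyz + offsets                                   # back-mapped coordinates
    # nearest reference point for every back-mapped coordinate
    d2 = ((mapped[:, None, :] - ref_xyz[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)
    return ref_feat[nearest]                                     # sampled reference features

def fuse(ref_feat_sampled, cur_feat):
    """Aggregate sampled reference features with the current frame's own
    features (simple averaging as a placeholder for the learned aggregation)."""
    return 0.5 * (ref_feat_sampled + cur_feat)
```

When the predicted offsets map each current point exactly onto a reference point, the sampled features are the corresponding reference features, which is the ideal case this alignment targets.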
- 6. The method according to claim 1, wherein performing the second entropy encoding and decoding based on residual generation on the layer-(l-1) fused alignment feature and the layer-(l-1) original compressed point cloud to obtain the layer-(l-1) reconstructed point cloud comprises the following steps: fusing the layer-(l-1) fused alignment feature and the layer-(l-1) original compressed point cloud, and then sequentially performing residual generation and rounding quantization to obtain second quantized data; performing third entropy encoding on the second quantized data to obtain a third code stream, wherein the third code stream is used for storage and transmission; performing third entropy decoding on the third code stream to obtain a decoded point cloud, wherein the third entropy encoding and the third entropy decoding are configured with a second entropy model for modeling the distribution of the feature residuals; and performing residual fusion on the decoded point cloud and the layer-(l-1) fused alignment feature to obtain the layer-(l-1) reconstructed point cloud.
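The residual pipeline of claim 6 reduces to a compact sketch once the entropy coder is treated as lossless. Only the rounding quantization introduces distortion here; the second entropy model and the learned fusion are omitted, and the function name is hypothetical.

```python
import numpy as np

def residual_codec(fused_feat, orig_feat):
    """Sketch of claim 6: form the residual of the original features against the
    fused alignment feature, round it (second quantized data), (notionally)
    entropy encode/decode it, and add it back at the decoder."""
    residual = orig_feat - fused_feat        # residual generation
    q = np.rint(residual)                    # rounding quantization
    decoded = q                              # third entropy codec placeholder (lossless)
    return fused_feat + decoded              # residual fusion at the decoder
```

Because only the residual is quantized, the reconstruction error is bounded by the rounding step regardless of how large the features themselves are, which is the usual motivation for residual coding.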
- 7. The method according to claim 1, wherein obtaining the final reconstructed point cloud from the layer-1 reconstructed point cloud through the first multi-layer upsampling processing comprises the following steps: processing the layer-1 reconstructed point cloud through a plurality of lossy compression modules to obtain an initial reconstructed point cloud, wherein each lossy compression module is configured with generative upsampling of a preset multiple; performing binary classification on the initial reconstructed point cloud to obtain classification results; and removing non-occupied points from the initial reconstructed point cloud by using a pruning technique based on the classification results to obtain the final reconstructed point cloud.
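The classify-then-prune step of claim 7 can be sketched in a few lines. The occupancy logits would come from the trained classification head, which is not specified in the claim; the thresholding shown here is one plausible reading of "binary classification", and the function name is hypothetical.

```python
import numpy as np

def final_prune(points, occupancy_logits, threshold=0.0):
    """Sketch of claim 7: classify each generatively upsampled point as occupied
    or not, then prune the points classified as non-occupied."""
    keep = occupancy_logits > threshold      # binary classification result
    return points[keep]                      # pruning of non-occupied points
```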
- 8. A dynamic point cloud geometric compression device based on space-time feature enhancement, the device comprising: a first module for acquiring point cloud data, wherein the point cloud data comprises a reference point cloud and an original point cloud; a second module for performing first multi-layer downsampling on the reference point cloud and the original point cloud to correspondingly obtain a layer-1 reference compressed point cloud and a layer-1 original compressed point cloud; a third module for performing second multi-layer downsampling on the layer-1 reference compressed point cloud and the layer-1 original compressed point cloud to correspondingly obtain a reference compressed point cloud set and an original compressed point cloud set, wherein the reference compressed point cloud set comprises the layer-2 to layer-L reference compressed point clouds, and the original compressed point cloud set comprises the layer-2 to layer-L original compressed point clouds; a fourth module for performing first entropy encoding and decoding on the layer-L original compressed point cloud to obtain a layer-L reconstructed point cloud; a fifth module for initializing l to L; a sixth module for performing one-time upsampling on the layer-l reconstructed point cloud to obtain a layer-(l-1) expanded point cloud, and performing geometric encoding and decoding based on the layer-(l-1) expanded point cloud and the layer-(l-1) original compressed point cloud to obtain a layer-(l-1) coarse reconstructed point cloud; a seventh module for performing temporal modeling by using an offset-guided interpolation technique based on the layer-(l-1) coarse reconstructed point cloud and the layer-(l-1) reference compressed point cloud to obtain a layer-(l-1) fused alignment feature; an eighth module for performing second entropy encoding and decoding based on residual generation on the layer-(l-1) fused alignment feature and the layer-(l-1) original compressed point cloud to obtain the layer-(l-1) reconstructed point cloud; a ninth module for decrementing l and, if l is not equal to 1, returning to the operation of the sixth module until l is equal to 1; and a tenth module for obtaining a final reconstructed point cloud from the layer-1 reconstructed point cloud through first multi-layer upsampling processing.
- 9. An electronic device comprising a memory storing a computer program and a processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.
- 10. A computer program product, characterized in that it comprises a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
Description
Dynamic point cloud geometric compression method based on space-time feature enhancement and related equipment

Technical Field

The invention relates to the technical field of point cloud data processing, and in particular to a dynamic point cloud geometric compression method based on space-time feature enhancement and related equipment.

Background

With the popularization of three-dimensional visual applications such as virtual reality and autonomous driving, the dynamic point cloud has become an important representation of three-dimensional scenes and objects, and its efficient compression has become a key technical challenge. Existing dynamic point cloud compression schemes fall into two major categories: traditional coding standards and end-to-end methods based on deep learning. Traditional coding standards are represented by G-PCC and V-PCC released by MPEG: G-PCC directly encodes geometry in three-dimensional space using data structures such as octrees, while V-PCC projects the point cloud onto a two-dimensional plane and then compresses it with video coding standards. These methods rely on a large number of manually designed modules and heuristic rules, are difficult to optimize globally, and face bottlenecks in compression performance. End-to-end methods based on deep learning have become a research hotspot in recent years. Schemes based on three-dimensional sparse convolution, such as D-DPCC, LDPCC, and Unicorn, perform well in geometric compression of dense dynamic point clouds. They map the point cloud into a latent space through neural networks and introduce temporal modeling to exploit inter-frame correlation.
However, existing schemes still have two core defects: 1) explicit motion estimation and compensation (MEMC) is usually carried out at a single scale, is computationally complex, and struggles to fuse multi-scale temporal information; and 2) implicit temporal modeling based on fixed convolution kernels has a limited receptive field and has difficulty adapting to the complex, non-rigid motion patterns in point clouds. In addition, when existing lossy compression methods pursue high compression rates, the loss of geometric information may reduce reconstruction quality, as they lack a mechanism to ensure accurate reconstruction of critical geometry.

Disclosure of Invention

The embodiments of the invention mainly aim to provide a dynamic point cloud geometric compression method, device, electronic equipment, storage medium, and program product based on space-time feature enhancement, so as to solve at least one of the problems in the prior art.
In order to achieve the above objective, an aspect of the embodiments of the present invention provides a dynamic point cloud geometric compression method based on space-time feature enhancement, where the method includes: acquiring point cloud data, wherein the point cloud data comprises a reference point cloud and an original point cloud; performing first multi-layer downsampling on the reference point cloud and the original point cloud to correspondingly obtain a layer-1 reference compressed point cloud and a layer-1 original compressed point cloud; performing second multi-layer downsampling on the layer-1 reference compressed point cloud and the layer-1 original compressed point cloud to correspondingly obtain a reference compressed point cloud set and an original compressed point cloud set, wherein the reference compressed point cloud set comprises the layer-2 to layer-L reference compressed point clouds, and the original compressed point cloud set comprises the layer-2 to layer-L original compressed point clouds; performing first entropy encoding and decoding on the layer-L original compressed point cloud to obtain a layer-L reconstructed point cloud; initializing l to L; performing one-time upsampling on the layer-l reconstructed point cloud to obtain a layer-(l-1) expanded point cloud, and performing geometric encoding and decoding based on the layer-(l-1) expanded point cloud and the layer-(l-1) original compressed point cloud to obtain a layer-(l-1) coarse reconstructed point cloud; performing temporal modeling by using an offset-guided interpolation technique based on the layer-(l-1) coarse reconstructed point cloud and the layer-(l-1) reference compressed point cloud to obtain a layer-(l-1) fused alignment feature; performing second entropy encoding and decoding based on residual generation on the layer-(l-1) fused alignment feature and the layer-(l-1) original compressed point cloud to obtain the layer-(l-1) reconstructed point cloud; decrementing l and, if l is not equal to 1, returning to the step of performing one-time upsampling on the layer-l reconstructed point cloud until l is equal to 1; and, according to the layer-1 reconstru