Search

CN-121986359-A - Method, apparatus and medium for point cloud processing

Abstract

Embodiments of the present disclosure provide a solution for point cloud processing. A method for point cloud processing is presented. The method includes determining, for a transition between a current Point Cloud (PC) sample of a point cloud sequence and a bitstream of the point cloud sequence, a prediction of a current feature for the current PC sample based on a current sparse PC sample of the current PC sample, a sparse PC sample set of at least one reference PC sample of the current PC sample, and a feature set; and performing the transition based on the prediction of the current feature.

Inventors

  • Zhou Diechen
  • Xu Yingzhan
  • Zhang Kai
  • Zhang Li
  • Zhang Xinfeng

Assignees

  • Douyin Vision Co., Ltd. (抖音视界有限公司)
  • ByteDance Ltd. (字节跳动有限公司)

Dates

Publication Date
2026-05-05
Application Date
2024-10-06
Priority Date
2023-10-07

Claims (20)

  1. A method for point cloud processing, comprising: determining, for a transition between a current Point Cloud (PC) sample of a point cloud sequence and a bitstream of the point cloud sequence, a prediction of a current feature for the current PC sample based on a current sparse PC sample of the current PC sample, a sparse PC sample set of at least one reference PC sample of the current PC sample, and a feature set; and performing the transition based on the prediction of the current feature.
  2. The method of claim 1, wherein the at least one reference PC sample comprises a single reference PC sample.
  3. The method of claim 2, wherein a downsampling process with at least one stage is applied to the single reference PC sample, the sparse PC sample set comprises a sparse PC sample of the single reference PC sample generated at a last stage of the at least one stage, and the feature set comprises a feature of the single reference PC sample generated at the last stage.
  4. The method of claim 3, wherein the prediction for the current feature is determined by directly using the current sparse PC sample of the current PC sample, the sparse PC sample of the single reference PC sample, and the feature.
  5. The method of claim 3, wherein the prediction for the current feature is determined by performing a refined secondary prediction process based on the current sparse PC sample of the current PC sample, the sparse PC sample of the single reference PC sample, and the feature.
  6. The method of claim 3, wherein determining the prediction for the current feature comprises: generating an initial prediction for the current feature based on the current sparse PC sample of the current PC sample, the sparse PC sample of the single reference PC sample, and the feature; and generating a second prediction for the current feature based on the initial prediction for the current feature and the feature of the single reference PC sample, to obtain the prediction for the current feature.
  7. The method of any of claims 3-6, wherein the prediction for the current feature is determined using a first model based on a Neural Network (NN).
  8. The method of claim 7, wherein the NN-based first model includes at least one sparse convolution or at least one sparse convolution on target coordinates.
  9. The method of claim 2, wherein a downsampling process with multiple stages is applied to the single reference PC sample, the sparse PC sample set comprises more than one sparse PC sample of the single reference PC sample generated at more than one stage of the multiple stages, and the feature set comprises more than one feature of the single reference PC sample generated at the more than one stage.
  10. The method of claim 9, wherein a sparse PC sample and a feature of the single reference PC sample generated at a first stage of the more than one stage are downsampled; the downsampled sparse PC sample and the downsampled feature are aligned and fused with a sparse PC sample and a feature of the single reference PC sample generated at a second stage of the more than one stage, respectively; and the second stage follows the first stage.
  11. The method of claim 10, wherein the prediction for the current feature is determined based on the current sparse PC sample of the current PC sample, a fused sparse PC sample of the single reference PC sample at a last one of the more than one stage, and a fused feature.
  12. The method of any of claims 9-11, wherein the prediction for the current feature is determined using a first model based on a Neural Network (NN).
  13. The method of claim 12, wherein the NN-based first model includes at least one sparse convolution or at least one sparse convolution on target coordinates.
  14. The method of claim 1, wherein the at least one reference PC sample comprises a plurality of reference PC samples.
  15. The method of claim 14, wherein the number of the plurality of reference PC samples is two.
  16. The method of any of claims 14-15, wherein a downsampling process with at least one stage is applied to the plurality of reference PC samples, the sparse PC sample set comprises sparse PC samples of the plurality of reference PC samples generated at a last stage of the at least one stage, and the feature set comprises features of the plurality of reference PC samples generated at the last stage.
  17. The method of claim 16, wherein the prediction for the current feature is determined by directly using the current sparse PC sample of the current PC sample, the sparse PC samples of the plurality of reference PC samples, and the features.
  18. The method of claim 17, wherein determining the prediction for the current feature comprises: fusing the sparse PC samples of the plurality of reference PC samples; fusing the features of the plurality of reference PC samples; and generating the prediction for the current feature based on the current sparse PC sample of the current PC sample, the fused sparse PC sample, and the fused feature.
  19. The method of claim 16, wherein the prediction for the current feature is determined by performing a refined secondary prediction process based on the current sparse PC sample of the current PC sample, the sparse PC samples of the plurality of reference PC samples, and the features.
  20. The method of claim 16, wherein the plurality of reference PC samples comprises a first reference PC sample and a second reference PC sample, and determining the prediction for the current feature comprises: generating a first prediction for the current feature based on the current sparse PC sample of the current PC sample, a sparse PC sample of the first reference PC sample, and a feature of the first reference PC sample; generating a second prediction for the current feature based on the current sparse PC sample of the current PC sample, a sparse PC sample of the second reference PC sample, and a feature of the second reference PC sample; and generating the prediction for the current feature based on a result of fusing the first prediction and the second prediction.
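Claims 14-20 describe predicting the current feature from two decoded reference PC samples and then fusing the per-reference predictions. The following is a minimal, hypothetical sketch of that idea in plain PyTorch; it is not the patented implementation: nearest-neighbour feature gathering stands in for the sparse convolutions named in claims 8 and 13, and all module, function, and parameter names (BiReferencePredictor, gather_nearest, channels) are illustrative assumptions, not terms taken from the disclosure.

    # Hypothetical sketch (not the patented implementation): bi-reference feature
    # prediction in the spirit of claims 14-20, using dense PyTorch tensors in
    # place of a true sparse-convolution backend.
    import torch
    import torch.nn as nn


    def gather_nearest(query_xyz, ref_xyz, ref_feat):
        """For each query coordinate, copy the feature of its nearest reference point."""
        dist = torch.cdist(query_xyz, ref_xyz)   # (N, M) pairwise distances
        idx = dist.argmin(dim=1)                 # index of nearest reference point
        return ref_feat[idx]                     # (N, C) gathered features


    class BiReferencePredictor(nn.Module):
        """Predict current-frame features from two decoded reference frames."""

        def __init__(self, channels=64):
            super().__init__()
            # simple MLP that fuses the two per-reference predictions (claim 20)
            self.fuse = nn.Sequential(
                nn.Linear(2 * channels, channels), nn.ReLU(),
                nn.Linear(channels, channels),
            )

        def forward(self, cur_xyz, ref1_xyz, ref1_feat, ref2_xyz, ref2_feat):
            # first prediction from reference 1, second prediction from reference 2
            pred1 = gather_nearest(cur_xyz, ref1_xyz, ref1_feat)
            pred2 = gather_nearest(cur_xyz, ref2_xyz, ref2_feat)
            # fuse the two predictions into the prediction for the current feature
            return self.fuse(torch.cat([pred1, pred2], dim=-1))


    # toy usage: 100 current points, 80/90 reference points, 64-dim features
    cur = torch.rand(100, 3)
    r1, f1 = torch.rand(80, 3), torch.rand(80, 64)
    r2, f2 = torch.rand(90, 3), torch.rand(90, 64)
    pred = BiReferencePredictor()(cur, r1, f1, r2, f2)   # -> (100, 64)

An actual codec along the lines of the claims would form these predictions with learned sparse convolutions on the target coordinates and would then code the residual between the actual current feature and the fused prediction, rather than the feature itself.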

Description

Method, apparatus and medium for point cloud processing

Technical Field

Embodiments of the present disclosure relate generally to point cloud processing technology and, more particularly, to reference-frame-based dynamic point cloud coding.

Background

A point cloud is a collection of individual data points in three-dimensional (3D) space, where each point has coordinates along the X, Y, and Z axes. Thus, a point cloud may be used to represent the physical content of a three-dimensional scene. For a variety of immersive applications, from augmented reality to autonomous driving, point clouds have proven to be a promising way to represent 3D visual data. Point cloud codec standards have evolved mainly through the work of the well-known MPEG organization. MPEG, the Moving Picture Experts Group, is one of the main standardization groups dealing with multimedia. In 2017, the MPEG 3D Graphics Coding group (3DG) published a call for proposals (CFP) to begin developing point cloud codec standards. The final standard will encompass two categories of solutions: video-based point cloud compression (V-PCC or VPCC), which is applicable to point sets where the distribution of points is relatively uniform, and geometry-based point cloud compression (G-PCC or GPCC), which is suitable for sparser distributions. However, it is generally desirable to further improve the codec efficiency of conventional point cloud codec techniques.

Disclosure of Invention

Embodiments of the present disclosure provide a solution for point cloud processing. In a first aspect, a method for point cloud processing is presented. The method includes determining, for a transition between a current Point Cloud (PC) sample of a point cloud sequence and a bitstream of the point cloud sequence, a prediction of a current feature for the current PC sample based on a current sparse PC sample of the current PC sample, a sparse PC sample set of at least one reference PC sample of the current PC sample, and a feature set; and performing the transition based on the prediction of the current feature. Based on the method according to the first aspect of the present disclosure, the prediction of the current feature for the current PC sample is determined based on the current sparse PC sample of the current PC sample, the sparse PC sample set of at least one reference PC sample of the current PC sample, and the feature set. Further, the current PC sample is encoded based on the prediction for the current feature. Compared with conventional solutions, the proposed method can advantageously exploit the information of already decoded PC samples in order to reduce redundancy between the information of the current PC sample and the information of the decoded PC samples. In this way, the codec efficiency can be improved.

In a second aspect, an apparatus for point cloud processing is presented. The apparatus includes a processor and a non-transitory memory having instructions thereon. The instructions, when executed by the processor, cause the processor to perform a method according to the first aspect of the present disclosure. In a third aspect, a non-transitory computer readable storage medium is presented. The non-transitory computer readable storage medium stores instructions that cause a processor to perform a method according to the first aspect of the present disclosure.
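Before the remaining aspects are summarized, the first-aspect flow can be made concrete with a minimal sketch under simplifying assumptions: a single reference PC sample, one downsampling stage implemented as voxel-grid pooling, and nearest-neighbour feature transfer standing in for the learned prediction. The function names, the voxel size, and the feature dimensions below do not come from the disclosure; they are illustrative only.

    # Minimal sketch of the first-aspect flow under simplifying assumptions
    # (single reference, one downsampling stage, nearest-neighbour prediction).
    import torch


    def downsample_one_stage(xyz, feat, voxel=0.1):
        """One downsampling stage: quantize coordinates to a voxel grid and
        average the features falling into each occupied voxel."""
        vox = torch.div(xyz, voxel, rounding_mode="floor").long()     # voxel index per point
        uniq, inv = torch.unique(vox, dim=0, return_inverse=True)     # occupied voxels
        sparse_xyz = (uniq.float() + 0.5) * voxel                     # sparse PC sample
        sparse_feat = torch.zeros(len(uniq), feat.shape[1]).index_add_(0, inv, feat)
        counts = torch.zeros(len(uniq)).index_add_(0, inv, torch.ones(len(xyz)))
        return sparse_xyz, sparse_feat / counts.unsqueeze(1)          # averaged features


    def predict_current_feature(cur_sparse_xyz, ref_sparse_xyz, ref_feat):
        """Predict features at the current sparse coordinates from the reference."""
        idx = torch.cdist(cur_sparse_xyz, ref_sparse_xyz).argmin(dim=1)
        return ref_feat[idx]


    # toy usage: the encoder would code the residual between the actual current
    # features and this prediction, instead of the features themselves
    ref_xyz, ref_feat = torch.rand(500, 3), torch.rand(500, 32)
    cur_sparse_xyz = torch.rand(120, 3)
    s_xyz, s_feat = downsample_one_stage(ref_xyz, ref_feat)
    pred = predict_current_feature(cur_sparse_xyz, s_xyz, s_feat)     # -> (120, 32)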
In a fourth aspect, another non-transitory computer readable recording medium is presented. The non-transitory computer readable recording medium stores a bitstream of a point cloud sequence generated by a method performed by an apparatus for point cloud processing. The method includes determining a prediction of a current feature for a current PC sample of the point cloud sequence based on a current sparse PC sample of the current PC sample, a sparse PC sample set of at least one reference PC sample of the current PC sample, and a feature set, and generating the bitstream based on the prediction of the current feature.

In a fifth aspect, a method for storing a bitstream of a point cloud sequence is presented. The method includes determining a prediction of a current feature for a current PC sample of the point cloud sequence based on a current sparse PC sample of the current PC sample, a sparse PC sample set of at least one reference PC sample of the current PC sample, and a feature set; generating the bitstream based on the prediction of the current feature; and storing the bitstream in a non-transitory computer readable recording medium.

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

Drawings

The above and other objects, features and advantages of the exemplary embodiments of