CN-112912890-B - Method and system for generating synthetic point cloud data using a generation model
Abstract
Methods and systems for generating synthetic point cloud data are described. A projected 2D data grid is generated by projecting a 3D point cloud into a 2D grid with rotational invariance. The projected 2D data grid is used to learn a generative model, which is implemented using, for example, flexible convolution and transpose flexible convolution operations in a generative adversarial network. The learned generative model is used to generate a synthetic point cloud.
Inventors
- Lucas puggia
- Joel Pinot
- Almira amilo aborfati
Assignees
- Huawei Technologies Co., Ltd.
- The Royal Institution for the Advancement of Learning / McGill University
Dates
- Publication Date
- 20260421
- Application Date
- 20190914
- Priority Date
- 20190914
Claims (17)
- 1. A method, comprising: acquiring a first batch of point clouds representing a three-dimensional (3D) environment; generating a projected two-dimensional (2D) data grid for each point cloud of the first batch by projecting each point cloud of the first batch into a projected 2D data grid having rotational invariance; and, in a training phase, learning a generation model for generating one or more batches of synthetic point clouds, the generation model being learned by providing the projected 2D data grid for each point cloud of the first batch to the generation model, wherein the generation model comprises a flexible convolution operation and a transpose flexible convolution operation; wherein generating the projected 2D data grid further comprises wrapping elements of the projected 2D data grid from one edge of the projected 2D data grid to an opposite edge to pad (populate) the projected 2D data grid; and wherein, in the training phase, the generation model is learned using the padded projected 2D data grid.
- 2. The method as recited in claim 1, further comprising: generating, using the learned generation model, one or more batches of synthetic point clouds from data sampled from a distribution.
- 3. The method of claim 1, wherein each point cloud in the first batch of point clouds is projected onto the projected 2D data grid using the formula: where x, y, and z represent 3D coordinates of data points in the point cloud.
- 4. The method according to claim 3, wherein the projected 2D data grid is padded by adding a new leftmost column containing elements from the original rightmost column of the projected 2D data grid, and adding a new rightmost column containing elements from the original leftmost column of the projected 2D data grid, wherein the number of padded columns is determined by the size of the convolution kernel.
- 5. The method of claim 4, wherein each row in the projected 2D data grid corresponds to a respective closed loop in the point cloud.
- 6. The method according to any one of claims 1 to 5, further comprising: using the one or more batches of synthetic point clouds to supplement any missing data points in the point clouds of the first batch.
- 7. The method according to any one of claims 1 to 5, further comprising: combining the generation model with a sequence model, wherein the sequence model generates synthetic temporal data that is used to predict any missing data points in the first batch of point clouds.
- 8. The method according to any one of claims 1 to 5, wherein the learned generation model is a recurrent neural network (RNN).
- 9. The method as recited in claim 1, further comprising: generating, using the learned generation model, one or more batches of synthetic point clouds from data sampled from a distribution according to actions of an autonomous device.
- 10. A processing unit, comprising: a processor; and a memory storing computer-executable instructions that, when executed by the processor, cause the processing unit to perform the method of any one of claims 1 to 9.
- 11. A computer-readable memory storing instructions which, when executed by a processor of a processing unit, cause the processing unit to perform the method of any one of claims 1 to 9.
- 12. A method, comprising: acquiring a first batch of point clouds representing a 3D environment; generating a projected 2D data grid for each point cloud of the first batch by projecting each point cloud of the first batch into a projected 2D data grid having rotational invariance; and, in a training phase, learning a generation model for generating one or more batches of synthetic point clouds, the generation model being learned by providing the first batch of point clouds to the generation model, wherein the projected 2D data grid is used to identify nearest neighbors for performing a flexible convolution operation and a transpose flexible convolution operation during learning; wherein generating the projected 2D data grid further comprises wrapping elements of the projected 2D data grid from one edge of the projected 2D data grid to an opposite edge to pad (populate) the projected 2D data grid; and wherein, in the training phase, the generation model is learned using the padded projected 2D data grid.
- 13. The method as recited in claim 12, further comprising: generating, using the learned generation model, one or more batches of synthetic point clouds from data sampled from a distribution.
- 14. The method according to claim 12 or 13, further comprising: combining the generation model with a sequence model, wherein the sequence model captures temporal data that is used to predict any missing data points in the first batch of point clouds.
- 15. The method of claim 12, wherein the learned generation model is a recurrent neural network (RNN).
- 16. A processing unit, comprising: a processor; and a memory storing computer-executable instructions that, when executed by the processor, cause the processing unit to perform the method of any one of claims 12 to 15.
- 17. A computer-readable memory storing instructions which, when executed by a processor of a processing unit, cause the processing unit to perform the method of any one of claims 12 to 15.
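By way of a non-limiting illustration, the edge wrapping recited in claims 1, 4 and 12 may be sketched as follows. The sketch is not taken from the source and rests on stated assumptions: the grid is held as a list of rows, the columns are taken to run around the sensor so that the leftmost and rightmost columns are neighbors, the helper name `wrap_pad_columns` is hypothetical, and the number of padded columns per side is assumed to be (kernel width - 1) / 2 for an odd kernel width. The claims state only that the number of padded columns is determined by the size of the convolution kernel.

```python
def wrap_pad_columns(grid, kernel_width):
    """Circularly pad a 2D grid (a list of equal-length rows) along its columns.

    Assumption (not from the source): an odd kernel width k adds
    (k - 1) // 2 wrapped columns on each side.
    """
    if kernel_width < 1 or kernel_width % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel width >= 1")
    pad = (kernel_width - 1) // 2
    padded = []
    for row in grid:
        if pad > len(row):
            raise ValueError("padding is wider than the grid")
        row = list(row)
        # New leftmost columns come from the original rightmost columns,
        # new rightmost columns come from the original leftmost columns.
        left = row[len(row) - pad:] if pad else []
        right = row[:pad]
        padded.append(left + row + right)
    return padded


if __name__ == "__main__":
    grid = [[1, 2, 3, 4],
            [5, 6, 7, 8]]
    for row in wrap_pad_columns(grid, kernel_width=3):
        print(row)
```

With this padding, a kernel centered on the first or last original column sees the values that are physically adjacent on the other side of the scan, so a cyclic shift of the columns (a rotation of the scan) shifts the padded interior in the same way.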
Description
Method and system for generating synthetic point cloud data using a generation model
Technical Field
The present application relates to a system and method for learning a generation model for generating synthetic point cloud data, and to a system and method for generating synthetic point cloud data from data sampled from a distribution using the learned generation model.
Background
In many autonomous tasks, knowledge of the environment plays a key role. The success of an autonomous device (e.g., a robot or vehicle) in performing an autonomous task depends on robust sensory data input and on algorithms for processing the sensory data. In many cases, the sensory data is noisy or some of it is missing. To handle this situation, autonomous devices need to be able to "understand" such sensory data. Humans have this capability. For example, if some pixels in some video frames are lost, a human can easily "predict/imagine" the lost data (e.g., mentally reconstruct the lost samples from temporal and spatial information about those pixels) and still have the same perception of the video.
Furthermore, many autonomous tasks require testing and even training in a simulator environment, because training and testing autonomous devices in a real environment may be difficult, e.g., costly and/or unsafe (as in the case of autonomous driving). However, many conventional simulators cannot provide realistic sensory data for autonomous devices. As a result, autonomous devices trained and tested in simulators may not operate well in a real environment. It is therefore desirable to build simulators that can produce more realistic data. To solve the above-described problems, an effective method for generating synthetic data is required.
In recent years, researchers have successfully used generative models to generate image and video data, for example as described in Goodfellow et al. (Advances in Neural Information Processing Systems, pages 2672-2680, 2014) and Zhu et al. (arXiv preprint, 2017). However, in most cases, autonomous devices require a three-dimensional (3D) understanding of the real environment in order to operate well, which relies on accurate 3D sensory data (e.g., in the form of a point cloud). Generating point clouds currently remains a challenging task.
A generative model is a type of machine learning method whose purpose is to generate samples from the same distribution as the training data. There are different types of generative models, such as the variational autoencoder (VAE) and the generative adversarial network (GAN). Many generative models that use convolutional neural networks (CNNs) in their architecture require a deconvolution operation (e.g., in the decoder of a VAE or in the generator of a GAN). For conventional convolutions, this operation is the transposed convolution. Many deep learning-based methods capture local features in a non-invertible manner. Accordingly, there is a need for a system and method for generating synthetic data that addresses at least some of the above-mentioned problems.
Disclosure of Invention
The application provides a method and a system for generating synthetic data in the form of a point cloud. The disclosed methods and systems learn a generative model from point clouds during a training phase and generate a synthetic point cloud from the learned generative model during an inference phase. In some examples, the disclosed methods and systems may utilize a deep neural network (DNN) architecture.
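As a non-limiting illustration of the Background's remark that the deconvolution for a conventional convolution is a transposed convolution, the following sketch implements a strided 1D convolution and its transpose in plain Python. It covers only the conventional case, not the flexible convolution of the present application; the function names `conv1d` and `conv_transpose1d` and the example values are assumptions chosen for illustration.

```python
def conv1d(x, w, stride=1):
    """Conventional 1D convolution (cross-correlation form), no padding."""
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return [sum(x[i * stride + j] * w[j] for j in range(k))
            for i in range(n_out)]


def conv_transpose1d(y, w, stride=1):
    """Transposed 1D convolution: each input value scatters a scaled copy
    of the kernel into the (longer) output, which upsamples the signal."""
    k = len(w)
    out = [0.0] * ((len(y) - 1) * stride + k)
    for i, v in enumerate(y):
        for j in range(k):
            out[i * stride + j] += v * w[j]
    return out


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]   # "fine" signal, length 7
    w = [1, 0, -1]              # kernel
    y = [1, 2, 3]               # "coarse" signal, length 3

    down = conv1d(x, w, stride=2)           # length 7 -> length 3
    up = conv_transpose1d(y, w, stride=2)   # length 3 -> length 7
    print(len(down), len(up))

    # The two operations are transposes of one another:
    # <conv(x), y> equals <x, conv_transpose(y)>.
    print(dot(down, y), dot(x, up))
```

The transposed operation restores the spatial size that the strided convolution removed, which is why it appears in decoders and generators. It is the transpose of the convolution as a linear map, not its inverse, so it does not by itself recover the original input values.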
Methods and systems are also described for ordering and projecting data points from a point cloud into a grid-based data structure, which helps increase the efficiency of learning the generation model during the training phase.
According to one aspect of the application, a method is provided that includes: obtaining a first batch of point clouds representing a 3D environment; generating a projected two-dimensional (2D) data grid for each point cloud of the first batch by projecting each point cloud of the first batch into a projected 2D data grid with rotational invariance; and, during a training phase, learning a generation model that generates one or more batches of synthetic point clouds, the generation model being learned by providing it with the projected 2D data grid for each point cloud of the first batch, wherein the generation model includes a flexible convolution operation and a transpose flexible convolution operation. According to the foregoing aspect, the method further comprises generating one or more batches of synthetic point clouds from data sampled from a distribution using the learned generation model. According to any preceding aspect, generating the projected 2D data grid further comprises populating the projected 2D data grid by wrapping elements of the grid from one edge of the projected 2D data grid to an opposite edge, and learning the populated genera