
CN-122017874-A - Map generation method and device


Abstract

The application discloses a map generation method and device. The method acquires data from devices such as cameras, lidar, and GPS (Global Positioning System) receivers, converts the data into a high-precision map, and uses three-dimensional reconstruction to automatically generate both a high-precision two-dimensional map and a high-fidelity three-dimensional map, constructing an immersive virtual environment. The two-dimensional map provides the road topology and semantic information a vehicle needs to understand lane configurations and traffic regulations, forming a key foundation for safe navigation. The three-dimensional map comprises high-fidelity three-dimensional renderings of the road environment, including terrain, buildings, and other infrastructure; by enhancing depth perception and realism, it improves the simulation environment, enables more comprehensive testing of vehicle perception and of interaction with dynamic elements such as pedestrians and other vehicles, and faithfully reproduces the driving environment. This two-layer simulation capability significantly improves the efficiency of testing and developing autonomous driving technology and promotes the development of safer, more reliable autonomous driving systems.

Inventors

  • YE ZHONGWEN
  • PAN TAO
  • QI XIAOJUAN
  • WU HANFEI
  • LIU YUANXIANG
  • DONG ZHEN
  • YAO JUNWEI

Assignees

  • Hong Kong Productivity Council (香港生产力促进局)
  • The University of Hong Kong (香港大学)

Dates

Publication Date
2026-05-12
Application Date
2025-12-02

Claims (8)

  1. A map generation method, the method comprising: acquiring multi-view camera images, lidar point clouds, and inertial measurement unit (IMU) data; extracting features from the multi-view camera images to obtain image feature maps; converting the multi-view camera image feature maps into bird's-eye-view (BEV) image features; dividing the lidar point cloud into vertical pillars and computing statistical features for each pillar to obtain BEV lidar features; concatenating the BEV image features and the BEV lidar features to obtain fused BEV features; performing end-to-end detection on the fused BEV features with a DETR-like map element detector and performing label assignment via bipartite matching to obtain learnable element query vectors and learnable keypoint query vectors, wherein each learnable element query vector corresponds to one map element, the map elements include road boundaries, lane dividers, and zebra crossings, and the learnable keypoint query vectors correspond to the keypoint positions and geometric shapes of the map elements; feeding the learnable element query vectors into a decoder to obtain an instance representation of each map element; feeding the learnable keypoint query vectors into a decoder to obtain a keypoint representation of each map element; based on the instance representation and keypoint representation of each map element, sorting and connecting the keypoints to generate ordered polylines; applying coordinate embedding, position embedding, and value embedding to the ordered polylines to obtain high-precision map polylines; computing geometric features for each point in the lidar point cloud, the geometric features comprising the distance from each point to its voxel plane and the angle between the point's normal vector and the voxel plane's normal vector; associating pixel patches from different frames of the multi-view camera images with lidar points based on the geometric features of the point cloud, forming a textured point cloud; converting the textured point cloud into a database in COLMAP format; based on the COLMAP database, performing neural rendering with 3D Gaussian splatting to obtain a three-dimensional reconstruction model; aligning positions in the high-precision map polylines with positions in the three-dimensional reconstruction model using the IMU data to obtain an aligned high-precision map; and post-processing the aligned high-precision map to obtain a final vectorized high-precision map and a high-fidelity three-dimensional reconstruction map.
  2. The method of claim 1, wherein the aligning comprises aligning positions in the high-precision map polylines with positions in the three-dimensional reconstruction model using the inertial measurement unit data and acquired global navigation satellite system (GNSS) data.
  3. The method of claim 1, further comprising aligning the multi-view camera images, the lidar point clouds, the inertial measurement unit data, and the global navigation satellite system data using hardware time synchronization.
  4. The method of claim 1, wherein converting the multi-view camera image feature maps into BEV image features comprises: projecting the multi-view camera image features into three-dimensional space using a learnable prior or a geometric prior; and projecting the image features lifted into three-dimensional space onto a BEV grid to obtain the BEV image features.
  5. The method of claim 1, wherein the distance from a point to its voxel plane is computed by: constructing a vector from a point on the plane to the lidar point; computing the dot product of the constructed vector and the plane's normal vector; and taking the absolute value of the dot product as the distance from the point to its voxel plane.
  6. The method of claim 1, wherein the angle between a point's normal vector and the voxel plane's normal vector is computed by: computing the dot product of the point's normal vector and the plane's normal vector; and taking the arccosine of the computed dot product to obtain the angle between the two normal vectors.
  7. The method of claim 1, wherein the post-processing includes one or more of denoising, smoothing, and semantic labeling.
  8. A computer device comprising a processor, a memory, and a computer program stored in the memory, wherein the processor executes the computer program to carry out the steps of the method of any one of claims 1 to 7.
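The pillar step of claim 1 (dividing the lidar point cloud into vertical columns and computing per-pillar statistics on a BEV grid) can be sketched as follows. The grid extents, cell size, and the particular statistics (point count, mean height, max height) are illustrative assumptions, not values from the patent.

```python
import numpy as np

def pillarize(points, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0), cell=0.5):
    """Divide a lidar point cloud (N, 3) into vertical pillars on a BEV grid
    and compute per-pillar statistics: [point count, mean z, max z]."""
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    feats = np.zeros((nx, ny, 3), dtype=np.float32)
    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    ok = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    for i, j, z in zip(ix[ok], iy[ok], points[ok, 2]):
        n = feats[i, j, 0]
        feats[i, j, 0] = n + 1
        feats[i, j, 1] = (feats[i, j, 1] * n + z) / (n + 1)  # running mean height
        feats[i, j, 2] = max(feats[i, j, 2], z) if n > 0 else z  # max height
    return feats
```

A real implementation would vectorize the scatter step and typically feed the raw per-pillar points through a small PointNet-style encoder rather than fixed statistics.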
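Claim 4's geometric-prior variant of the image-to-BEV conversion can be sketched as below: each feature-map cell is lifted into 3D using a per-pixel depth and the camera intrinsics/extrinsics, then splatted onto the BEV grid by averaging. The per-pixel depth, grid parameters, and averaging scheme are assumptions for illustration; a learnable prior would instead predict a depth distribution per pixel.

```python
import numpy as np

def image_feats_to_bev(feat, depth, K, cam2ego,
                       x_range=(-50.0, 50.0), y_range=(-50.0, 50.0), cell=0.5):
    """Lift an (H, W, C) image feature map into 3D with per-pixel depth
    (geometric prior), transform to the ego frame, and average onto a BEV grid."""
    H, W, C = feat.shape
    us, vs = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([us, vs, np.ones_like(us)], axis=-1).reshape(-1, 3).astype(float)
    rays = (np.linalg.inv(K) @ pix.T).T            # unprojected camera rays
    pts_cam = rays * depth.reshape(-1, 1)          # 3D points in the camera frame
    pts = (cam2ego[:3, :3] @ pts_cam.T).T + cam2ego[:3, 3]  # camera -> ego frame
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((nx, ny, C))
    cnt = np.zeros((nx, ny, 1))
    ix = ((pts[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((pts[:, 1] - y_range[0]) / cell).astype(int)
    ok = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    f = feat.reshape(-1, C)
    for i, j, fv in zip(ix[ok], iy[ok], f[ok]):
        bev[i, j] += fv                            # accumulate features per cell
        cnt[i, j] += 1
    return bev / np.maximum(cnt, 1)                # average over points per cell
```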
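The geometric features of claims 5 and 6 reduce to two small vector operations, sketched here with numpy. The unit-normalization steps are an implementation assumption (the claims' dot-product formulations implicitly assume unit normals), as is the clipping before the arccosine for numerical safety.

```python
import numpy as np

def point_plane_distance(point, plane_point, plane_normal):
    """Claim 5: build the vector from a point on the voxel plane to the lidar
    point, project it onto the (unit) plane normal, and take the absolute value."""
    n = plane_normal / np.linalg.norm(plane_normal)
    return abs(np.dot(point - plane_point, n))

def normal_angle(point_normal, plane_normal):
    """Claim 6: angle between the point normal and the voxel-plane normal,
    via the arccosine of the dot product of the unit vectors."""
    a = point_normal / np.linalg.norm(point_normal)
    b = plane_normal / np.linalg.norm(plane_normal)
    return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
```

For example, a point at height 3 above the plane z = 0 has distance 3, and orthogonal normals yield an angle of π/2.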

Description

Map generation method and device

Technical Field

The application relates to the field of autonomous driving, and in particular to a high-precision, high-fidelity map generation method suitable for autonomous driving.

Background

Existing map generation methods often have difficulty capturing complex road details and real-environment features with precision, yet these elements are critical to efficient simulation and navigation. The techniques described in patents CN117635850B and CN113834492B rely mainly on single-sensor data (lidar or images), resulting in incomplete map output or insufficient reliability due to the lack of data integration. These systems often fail to integrate multimodal data, which is key to achieving the high precision and fidelity required for autonomous driving applications. The reliance on single-sensor data limits a system's ability to capture the complexity of the real environment, resulting in gaps in the mapping process, incomplete map output, or insufficient reliability. Furthermore, most existing methods provide only two-dimensional map representations and lack real-time three-dimensional scene reconstruction capabilities. This limitation significantly weakens their ability to build the immersive, realistic simulation environment necessary for full testing and verification of an autonomous driving system. It is therefore desirable to provide a map generation method that can accurately capture complex road details and real-environment features.

Disclosure of the Invention

The invention aims to provide a map generation method that can render a fine, accurate presentation of an environment and generate an immersive, realistic simulation environment, making it suitable for testing autonomous systems such as self-driving cars, unmanned aerial vehicles, and robots in a virtual environment.
In a first aspect of the present invention, there is provided a map generation method, the method comprising: acquiring multi-view camera images, lidar point clouds, and inertial measurement unit (IMU) data; extracting features from the multi-view camera images to obtain image feature maps; converting the multi-view camera image feature maps into bird's-eye-view (BEV) image features; dividing the lidar point cloud into vertical pillars and computing statistical features for each pillar to obtain BEV lidar features; concatenating the BEV image features and the BEV lidar features to obtain fused BEV features; performing end-to-end detection on the fused BEV features with a DETR-like map element detector and performing label assignment via bipartite matching to obtain learnable element query vectors and learnable keypoint query vectors, wherein each learnable element query vector corresponds to one map element, the map elements include road boundaries, lane dividers, and zebra crossings, and the learnable keypoint query vectors correspond to the keypoint positions and geometric shapes of the map elements; feeding the learnable element query vectors into a decoder to obtain an instance representation of each map element; feeding the learnable keypoint query vectors into a decoder to obtain a keypoint representation of each map element; based on the instance representation and keypoint representation of each map element, sorting and connecting the keypoints to generate ordered polylines; applying coordinate embedding, position embedding, and value embedding to the ordered polylines to obtain high-precision map polylines; computing geometric features for each point in the lidar point cloud, the geometric features comprising the distance from each point to its voxel plane and the angle between the point's normal vector and the voxel plane's normal vector; associating pixel patches from different frames of the multi-view camera images with lidar points based on the geometric features of the point cloud, forming a textured point cloud; converting the textured point cloud into a database in COLMAP format; based on the COLMAP database, performing neural rendering with 3D Gaussian splatting to obtain a three-dimensional reconstruction model; aligning positions in the high-precision map polylines with positions in the three-dimensional reconstruction model using the IMU data to obtain an aligned high-precision map; and post-processing the aligned high-precision map to obtain a final vectorized high-precision map and a high-fidelity three-dimensional reconstruction map. In a second aspect of the invention, there is provided a computer device comprising a processor, a memory and a computer pro
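The DETR-style label assignment described above pairs each predicted map element with at most one ground-truth element via bipartite matching. A minimal sketch using the Hungarian algorithm follows; the Chamfer-style polyline cost is an illustrative choice, not necessarily the cost used in the patent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_labels(pred_polylines, gt_polylines):
    """One-to-one assignment between predicted and ground-truth map elements.
    Each polyline is an (N, 2) array of 2D keypoints; the cost is a symmetric
    Chamfer distance between the two point sets."""
    cost = np.zeros((len(pred_polylines), len(gt_polylines)))
    for i, p in enumerate(pred_polylines):
        for j, g in enumerate(gt_polylines):
            d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)
            cost[i, j] = d.min(axis=1).mean() + d.min(axis=0).mean()
    rows, cols = linear_sum_assignment(cost)  # Hungarian matching
    return list(zip(rows, cols))
```

In a full detector the matching cost would also include a classification term, and unmatched predictions would be supervised as "no object".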