CN-116721249-B - Fusion labeling method based on neural network and related equipment
Abstract
The application discloses a neural-network-based fusion labeling method and related equipment. The method comprises: acquiring point cloud acquisition parameters of point cloud data, image acquisition parameters of image data, and calibration parameters between the point cloud data and the image data; and labeling a labeling frame of a target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, a pseudo 3D frame, the calibration parameters and a trained image labeling model. The image labeling model can learn, from the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame and the calibration parameters, the information required to map an inaccurate pseudo 3D frame to an accurate labeling frame, and it generates the labeling frame based on that information. This avoids the influence of accumulated physical-equipment error on the accuracy of the labeling frame, spares annotators from manually adjusting every labeling frame determined from the pseudo 3D frames, and reduces the cost of fusion labeling.
Inventors
- Liang Sishuo
- Lu Siyu
- Zhang Xia
Assignees
- Chongqing Changan Automobile Co., Ltd. (重庆长安汽车股份有限公司)
Dates
- Publication Date: 2026-05-05
- Application Date: 2023-06-30
Claims (9)
- 1. A fusion labeling method based on a neural network, the method comprising: acquiring point cloud data and image data through preset acquisition equipment, wherein the preset acquisition equipment is configured with point cloud acquisition equipment and image acquisition equipment; acquiring point cloud acquisition parameters of the point cloud data, image acquisition parameters of the image data, and calibration parameters between the point cloud acquisition equipment and the image acquisition equipment, wherein the point cloud acquisition parameters comprise a point cloud timestamp and a moving speed of the preset acquisition equipment, and the image acquisition parameters comprise an image timestamp and device parameters of the image acquisition equipment; determining a pseudo 3D frame of a target object in the image data based on the point cloud data; and labeling a labeling frame of the target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame, the calibration parameters and a trained image labeling model, wherein the image labeling model learns the characteristic information required to map the pseudo 3D frame to the labeling frame of the target object, so as to avoid the influence of accumulated physical-equipment error on the accuracy of the labeling frame; wherein determining the pseudo 3D frame of the target object in the image data based on the point cloud data specifically comprises: performing target detection on the point cloud data to obtain a 3D target frame corresponding to the target object; and projecting the 3D target frame onto the image data to obtain the pseudo 3D frame of the target object.
- 2. The fusion labeling method based on the neural network according to claim 1, wherein labeling the labeling frame of the target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame, the calibration parameters and the trained image labeling model specifically comprises: determining a time difference based on the point cloud timestamp and the image timestamp; and inputting the time difference, the moving speed, the pseudo 3D frame, the device parameters and the calibration parameters into the trained image labeling model, and outputting the labeling frame of the target object in the image data through the image labeling model.
- 3. The fusion labeling method based on a neural network according to claim 1, wherein after labeling the labeling frame of the target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame, the calibration parameters and the trained image labeling model, the method further comprises: receiving a correction operation for correcting the labeling frame; if the correction operation is received, correcting the labeling frame based on the correction operation, and taking the corrected labeling frame as the labeling frame of the target object in the image data; and if the correction operation is not received, keeping the labeling frame unchanged.
- 4. The fusion labeling method based on the neural network according to claim 1, wherein the training process of the image labeling model specifically comprises: acquiring a training dataset, wherein the training dataset comprises a plurality of training data groups, and each training data group in the plurality of training data groups comprises a training pseudo 3D frame, training point cloud acquisition parameters, training image acquisition parameters, training calibration parameters and a true-value labeling frame; determining a predicted labeling frame based on the training pseudo 3D frame, the training point cloud acquisition parameters, the training image acquisition parameters and the training calibration parameters in a training data group, and an initial network model corresponding to the image labeling model; and determining a loss term based on the predicted labeling frame and the true-value labeling frame in the training data group, and training the initial network model based on the loss term to obtain the trained image labeling model (an illustrative sketch of this training procedure follows the claims).
- 5. The neural-network-based fusion labeling method of claim 4, wherein acquiring the training dataset specifically comprises: acquiring point cloud data and image data through the preset acquisition equipment to obtain a plurality of point cloud data and a plurality of image data; synchronizing the plurality of point cloud data and the plurality of image data to obtain a plurality of data groups, wherein each data group in the plurality of data groups comprises point cloud data and image data; for each data group, acquiring training point cloud acquisition parameters of the point cloud data, training image acquisition parameters of the image data and training calibration parameters, determining a training pseudo 3D frame based on the point cloud data in the data group, and determining a true-value labeling frame based on the image data in the data group; and for each data group, constructing a training data group based on the training point cloud acquisition parameters, the training image acquisition parameters, the training pseudo 3D frame, the true-value labeling frame and the training calibration parameters corresponding to that data group, so as to obtain the training dataset.
- 6. The neural-network-based fusion labeling method according to claim 4, wherein after determining the predicted labeling frame based on the training pseudo 3D frame, the training point cloud acquisition parameters, the training image acquisition parameters, the training calibration parameters and the initial network model corresponding to the image labeling model in a training data group, the method further comprises: performing a correction operation on the predicted labeling frame to obtain a corrected labeling frame; and replacing the true-value labeling frame in the training data group with the corrected labeling frame to form an updated training data group, and adding the updated training data group to the training dataset.
- 7. A fusion labeling device based on a neural network, characterized by comprising: a data acquisition module for acquiring point cloud data and image data through preset acquisition equipment, wherein the preset acquisition equipment is configured with point cloud acquisition equipment and image acquisition equipment; a parameter acquisition module for acquiring point cloud acquisition parameters of the point cloud data, image acquisition parameters of the image data and calibration parameters between the point cloud acquisition equipment and the image acquisition equipment, wherein the point cloud acquisition parameters comprise a point cloud timestamp and a moving speed of the preset acquisition equipment, and the image acquisition parameters comprise an image timestamp and device parameters of the image acquisition equipment; a determining module for determining a pseudo 3D frame of a target object in the image data based on the point cloud data; and a labeling module for labeling a labeling frame of the target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame, the calibration parameters and a trained image labeling model, wherein the image labeling model learns the characteristic information required to map the pseudo 3D frame to the labeling frame of the target object, so as to avoid the influence of accumulated physical-equipment error on the accuracy of the labeling frame; wherein determining the pseudo 3D frame of the target object in the image data based on the point cloud data specifically comprises: performing target detection on the point cloud data to obtain a 3D target frame corresponding to the target object; and projecting the 3D target frame onto the image data to obtain the pseudo 3D frame of the target object.
- 8. A terminal device, characterized in that the terminal device comprises a memory, a processor and a neural-network-based fusion labeling program stored in the memory and executable on the processor, wherein the processor implements the steps of the neural-network-based fusion labeling method according to any one of claims 1-6 when executing the neural-network-based fusion labeling program.
- 9. A computer-readable storage medium, wherein a neural-network-based fusion labeling program is stored on the computer-readable storage medium, and when the neural-network-based fusion labeling program is executed by a processor, the steps of the neural-network-based fusion labeling method according to any one of claims 1-6 are implemented.
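For illustration only, the following is a minimal training sketch of the procedure described in claim 4, assuming the pseudo 3D frame corners, time difference, moving speed, device parameters and calibration parameters are flattened into a single feature vector, a small fully connected network as the image labeling model, and a smooth L1 regression loss. The architecture, feature encoding and loss choice are assumptions made for this sketch and are not specified by the claims.

```python
import torch
import torch.nn as nn

class AnnotationModel(nn.Module):
    """Hypothetical image labeling model: maps a flattened feature vector
    (pseudo 3D frame corners, time difference, moving speed, device and
    calibration parameters) to a 2D labeling frame (x1, y1, x2, y2)."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 4),
        )

    def forward(self, x):
        return self.net(x)

def train(model, loader, epochs=10, lr=1e-3):
    """Train against true-value labeling frames with a smooth L1 loss (claim 4)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.SmoothL1Loss()
    for _ in range(epochs):
        for features, gt_frame in loader:   # one training data group per sample
            pred_frame = model(features)    # predicted labeling frame
            loss = loss_fn(pred_frame, gt_frame)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

In this sketch `loader` is assumed to yield (feature vector, true-value labeling frame) pairs built as in claim 5; a corrected labeling frame produced as in claim 6 would simply replace the true-value frame in the corresponding pair.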
Description
Fusion labeling method based on neural network and related equipment

Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a fusion labeling method based on a neural network and related equipment.

Background
With the rapid development of the automotive industry, autonomous driving has become an inevitable trend in vehicle development. Autonomous-driving research requires a large amount of point cloud data, image data and corresponding labeling ground truth. The labeling method commonly used at present performs fusion labeling according to the physical parameters of the point cloud equipment corresponding to the point cloud data and of the image equipment corresponding to the image data, and therefore places extremely high demands on the calibration parameters and on the error of the physical acquisition equipment. In practical application, however, error accumulation caused by the physical equipment is difficult to avoid, so the accuracy of the labeling frames obtained by fusion labeling is low, annotators must manually adjust every labeling frame, and the labeling cost increases. Accordingly, there is a need for improvement and advancement in the art.

Disclosure of Invention
The application provides a fusion labeling method based on a neural network and related equipment, which are used to solve the technical problem in the related art of high labeling cost caused by error accumulation from physical equipment. To achieve the above purpose, the application adopts the following technical scheme. An embodiment of the first aspect of the application provides a fusion labeling method based on a neural network, comprising the following steps: acquiring point cloud data and image data through preset acquisition equipment, wherein the preset acquisition equipment is configured with point cloud acquisition equipment and image acquisition equipment; acquiring point cloud acquisition parameters of the point cloud data, image acquisition parameters of the image data, and calibration parameters between the point cloud data and the image data, wherein the point cloud acquisition parameters comprise a point cloud timestamp and a moving speed of the preset acquisition equipment, and the image acquisition parameters comprise an image timestamp and device parameters of the image acquisition equipment; determining a pseudo 3D frame of a target object in the image data based on the point cloud data; and labeling a labeling frame of the target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame, the calibration parameters and the trained image labeling model. By this technical means, the image labeling model learns, from the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frames and the calibration parameters, the information required to map an inaccurate pseudo 3D frame to an accurate labeling frame, and generates the labeling frame based on that information; the influence of accumulated physical-equipment error on the accuracy of the labeling frame is thus avoided, annotators are not required to manually adjust every labeling frame determined from the pseudo 3D frames, and the cost of fusion labeling is reduced.
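To make the pseudo 3D frame step above concrete, the following is a minimal sketch (not the patented implementation) of projecting a 3D target frame detected in the point cloud onto the image using hypothetical calibration parameters: a 4x4 lidar-to-camera extrinsic matrix `T_cam_lidar` and a 3x3 camera intrinsic matrix `K`. All names and the pinhole projection model are assumptions for illustration.

```python
import numpy as np

def box_corners_lidar(center, size, yaw):
    """Return the 8 corners, in lidar coordinates, of a 3D target frame.

    center: (x, y, z) box centre; size: (length, width, height); yaw: heading in radians.
    """
    l, w, h = size
    x = np.array([ l,  l,  l,  l, -l, -l, -l, -l]) / 2.0
    y = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2.0
    z = np.array([ h,  h, -h, -h,  h,  h, -h, -h]) / 2.0
    corners = np.stack([x, y, z])                       # 3 x 8
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])  # rotation about z-axis
    return rot @ corners + np.asarray(center).reshape(3, 1)

def project_pseudo_3d_frame(corners_lidar, T_cam_lidar, K):
    """Project lidar-frame box corners into the image to form a pseudo 3D frame.

    T_cam_lidar: 4x4 extrinsic calibration (lidar -> camera); K: 3x3 camera intrinsics.
    Returns an 8 x 2 array of pixel coordinates.
    """
    pts = np.vstack([corners_lidar, np.ones((1, 8))])   # homogeneous coordinates, 4 x 8
    cam = (T_cam_lidar @ pts)[:3]                       # points in camera coordinates
    uv = K @ cam
    return (uv[:2] / uv[2]).T                           # perspective division
```

A pseudo 3D frame obtained this way still carries calibration and time-synchronization error, which is exactly the error the trained image labeling model is intended to absorb when it maps the pseudo 3D frame to the final labeling frame.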
Optionally, in one embodiment of the application, labeling the labeling frame of the target object in the image data based on the point cloud acquisition parameters, the image acquisition parameters, the pseudo 3D frame, the calibration parameters and the trained image labeling model specifically comprises: determining a time difference based on the point cloud timestamp and the image timestamp; and inputting the time difference, the moving speed, the pseudo 3D frame, the device parameters and the calibration parameters into the trained image labeling model, and outputting the labeling frame of the target object in the image data through the image labeling model. By this technical means, the time difference and the moving speed are used as input items of the image labeling model, so that the image labeling model can learn the motion error between the point cloud data and the image data and compensate for it when generating the labeling frame. This addresses the practical problem that the acquisition time of the point cloud data can hardly be made exactly the same as the acquisition time of the image data, and further improves the accuracy of the labeling frame determined by the image labeling model.

Optionally, in one embodiment of the application, determining the pseudo 3D frame of the target object in the image data based on the point cloud data specifically comprises performing target detection on the point cloud data to obtain a 3D targe