CN-122024194-A - Multi-sensor semantic tag fusion method for fusing high-precision map
Abstract
The invention provides a multi-sensor semantic tag fusion method for fusing a high-precision map. The method comprises the following steps: S1, data input; S2, preprocessing; S3, double-factor weight configuration; S4, semantic fusion; S5, verification; S6, anomaly correction; and S7, result output, namely outputting the semantic tag library that has passed verification. The method can remarkably improve the accuracy and consistency of semantic tag fusion and the reliability of the system.
Inventors
- CAI KEFANG
- LIU SHUO
- TANG RONGJIANG
- CHEN ZHENGXIONG
Assignees
- Guilin University of Electronic Technology (桂林电子科技大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-02-03
Claims (8)
- 1. A multi-sensor semantic tag fusion method for fusing a high-precision map, characterized by comprising the following steps: S1, data input, namely acquiring three-dimensional point cloud data of road targets collected by a laser radar and two-dimensional image data of the road scene collected by a vision camera, and simultaneously importing high-precision map data containing static semantic elements and road topology information; S2, preprocessing, namely performing sensor calibration and spatial alignment on the laser radar and the vision camera so that their data are unified into the same coordinate system, converting the global coordinates of the high-precision map into the vehicle local coordinate system, extracting contour, distance and point cloud density features of each target from the point cloud data, extracting texture, illumination intensity and target integrity features from the image data, and associating the static targets perceived by the sensors with the static semantic elements in the high-precision map; S3, double-factor weight configuration, namely calculating the basic weight of each sensor based on the illumination intensity of the image data, and calculating a map matching degree coefficient based on the spatial overlap rate and attribute consistency between the semantic labels perceived by the sensors and the corresponding static semantic elements of the high-precision map; S4, semantic fusion, namely associating the semantic tags generated by the laser radar and the vision camera for the same target, and performing attribute fusion based on the final weights to generate an initial global fusion tag; S5, verification, namely performing five-dimensional verification on the initial global fusion tag, the five dimensions being tag consistency verification, geometric semantic verification, confidence verification, static existence verification and dynamic rationality verification; S6, anomaly correction, namely if any one of the verifications fails, executing the following operations: a. correcting the static semantic tags with the high-precision map as the reference and/or correcting the dynamic semantic tags in combination with the road topology information of the high-precision map; b. executing weight backtracking adjustment, namely reducing the weight of the responsible sensor that caused the anomaly according to a preset penalty ratio and correspondingly increasing the weight of the other sensor according to a preset compensation ratio; c. returning to step S4 with the adjusted weights to re-execute the semantic fusion, forming a correction closed loop, until the number of retries reaches a preset retry count threshold; and S7, result output, namely outputting the semantic tag library that has passed the verification (an illustrative sketch of this fusion-verification-correction loop follows the claims).
- 2. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 1, wherein in the double-factor weight configuration, the map matching degree coefficient is calculated as follows: when the spatial overlap rate is greater than or equal to 95% and the attributes are completely consistent, the map matching degree coefficient is 1.2; when the spatial overlap rate is between 85% and 95% or only a secondary attribute deviation exists, the map matching degree coefficient is 1.0; when the spatial overlap rate is less than 85% or a main attribute deviation exists, the map matching degree coefficient is 0.8; and when the matching degree coefficient falls below 0.8, the anomaly correction is triggered directly (an illustrative sketch of the double-factor weight configuration follows the claims).
- 3. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 1, wherein the confidence verification comprises calculating a confidence of the vision camera and a confidence of the laser radar: confidence of the vision camera = (texture matching degree × preset texture matching degree weight + target integrity × preset target integrity weight) × illumination influence coefficient, wherein when the image brightness is less than a preset brightness threshold, the illumination influence coefficient is a preset low illumination coefficient; confidence of the laser radar = (point cloud density × preset point cloud density weight + contour matching degree × preset contour matching degree weight) × distance attenuation coefficient, wherein when the target distance is greater than a preset distance threshold, the distance attenuation coefficient is a preset long-distance attenuation coefficient (an illustrative sketch of the confidence calculation follows the claims).
- 4. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 3, wherein the confidence verification requires that the confidences of the vision camera and the laser radar each be no less than a preset minimum confidence and that the difference between them be no more than a preset confidence difference threshold; if the difference exceeds the preset confidence difference threshold, the perception rationality is judged in combination with the matching degree of the high-precision map, and the confidences are calibrated accordingly.
- 5. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 1, wherein the basic weights are calculated according to a preset brightness threshold: when the image brightness is greater than the preset brightness threshold, the basic weight of the vision camera is the preset vision camera high illumination weight and the basic weight of the laser radar is the preset laser radar high illumination weight; when the image brightness is less than or equal to the preset brightness threshold, the basic weight of the vision camera is the preset vision camera low illumination weight and the basic weight of the laser radar is the preset laser radar low illumination weight.
- 6. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 1, wherein in the preprocessing, the error of sensor calibration and spatial alignment is no greater than a preset alignment error threshold, the error of high-precision map coordinate conversion is no greater than a preset coordinate conversion error threshold, and the association coverage rate between static targets and map elements is no less than a preset association coverage rate threshold.
- 7. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 1, wherein in the verification, the passing criteria of the respective dimensions are: the tag consistency matching rate is no less than a preset consistency matching rate threshold; the geometric semantic size deviation is no greater than a preset size deviation threshold and the position deviation is no greater than a preset position deviation threshold; the static existence missing rate is no greater than a preset missing rate threshold; and the dynamic rationality compliance rate is no less than a preset compliance rate threshold.
- 8. The multi-sensor semantic tag fusion method for fusing a high-precision map according to claim 1, wherein the preset penalty ratio for reducing the weight of the responsible sensor is 10%-20%, the preset compensation ratio for raising the weight of the other sensor is 5%-10%, the preset retry count threshold is 3, the preset brightness threshold is 50 cd/m2, the preset low illumination coefficient is 0.8, the preset distance threshold is 50 m, the preset long-distance attenuation coefficient is 0.7, the preset texture matching degree weight is 0.6, the preset target integrity weight is 0.4, the preset point cloud density weight is 0.5, the preset contour matching degree weight is 0.5, the preset minimum confidence is 0.8, the preset confidence difference threshold is 0.2, the preset vision camera high illumination weight is 70%, the preset laser radar high illumination weight is 30%, the preset vision camera low illumination weight is 30%, the preset laser radar low illumination weight is 70%, and the preset association coverage rate threshold is 95%.
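The following is a minimal Python sketch of the double-factor weight configuration described in claims 2 and 5. The patent does not provide an implementation: the function names and data structures are assumptions, the 70%/30% illumination weights are illustrative examples (the claim 8 values for these weights are partially garbled in this text), and reading the final weight as the product of the basic weight and the map matching degree coefficient is an assumption consistent with the "double-factor" naming. Only the 50 cd/m2 brightness threshold and the 1.2/1.0/0.8 coefficients are taken directly from the claims.

```python
from dataclasses import dataclass


@dataclass
class SensorWeights:
    camera: float
    lidar: float


BRIGHTNESS_THRESHOLD_CD_M2 = 50.0                    # claim 8
HIGH_LIGHT = SensorWeights(camera=0.7, lidar=0.3)    # assumed example values
LOW_LIGHT = SensorWeights(camera=0.3, lidar=0.7)     # assumed example values


def basic_weights(image_brightness: float) -> SensorWeights:
    """Claim 5: choose per-sensor basic weights from the image brightness."""
    if image_brightness > BRIGHTNESS_THRESHOLD_CD_M2:
        return HIGH_LIGHT
    return LOW_LIGHT


def map_matching_coefficient(spatial_overlap: float,
                             primary_attributes_match: bool,
                             secondary_attributes_match: bool) -> float:
    """Claim 2: coefficient from spatial overlap and attribute consistency."""
    if spatial_overlap >= 0.95 and primary_attributes_match and secondary_attributes_match:
        return 1.2
    if spatial_overlap >= 0.85 and primary_attributes_match:
        return 1.0   # 85%-95% overlap, or only a secondary attribute deviation
    return 0.8       # <85% overlap, or a main attribute deviation


def final_weight(basic_weight: float, matching_coefficient: float) -> float:
    """Assumed combination of the two factors into the weight used in step S4."""
    return basic_weight * matching_coefficient
```

Per claim 2, a matching degree coefficient lower than 0.8 would directly trigger the anomaly correction of step S6.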
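Claims 3 and 4 define the confidence verification by two weighted formulas. Below is a minimal Python sketch using the numeric values stated in claim 8 (texture weight 0.6, integrity weight 0.4, density and contour weights 0.5, low illumination coefficient 0.8, long-distance attenuation coefficient 0.7, minimum confidence 0.8, confidence difference threshold 0.2, brightness threshold 50 cd/m2, distance threshold 50 m); the assumption that the influence coefficients equal 1.0 under normal conditions, as well as the function names, are illustrative rather than specified by the patent.

```python
TEXTURE_W, INTEGRITY_W = 0.6, 0.4    # claim 8
DENSITY_W, CONTOUR_W = 0.5, 0.5      # claim 8
LOW_LIGHT_COEFF = 0.8                # applied when brightness < 50 cd/m^2
FAR_ATTENUATION = 0.7                # applied when target distance > 50 m
MIN_CONFIDENCE = 0.8
MAX_CONF_DIFF = 0.2


def camera_confidence(texture_match: float, target_integrity: float,
                      image_brightness: float) -> float:
    """Claim 3: vision camera confidence with an illumination influence coefficient."""
    illumination_coeff = LOW_LIGHT_COEFF if image_brightness < 50.0 else 1.0
    return (texture_match * TEXTURE_W + target_integrity * INTEGRITY_W) * illumination_coeff


def lidar_confidence(point_cloud_density: float, contour_match: float,
                     target_distance_m: float) -> float:
    """Claim 3: laser radar confidence with a distance attenuation coefficient."""
    distance_coeff = FAR_ATTENUATION if target_distance_m > 50.0 else 1.0
    return (point_cloud_density * DENSITY_W + contour_match * CONTOUR_W) * distance_coeff


def confidence_check(cam_conf: float, lidar_conf: float) -> bool:
    """Claim 4: both confidences must reach the minimum and stay close to each
    other; otherwise the map matching degree is consulted (not shown here)."""
    return (cam_conf >= MIN_CONFIDENCE
            and lidar_conf >= MIN_CONFIDENCE
            and abs(cam_conf - lidar_conf) <= MAX_CONF_DIFF)
```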
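Claim 1 (steps S4 to S7) together with claims 6 and 8 describes a correction closed loop in which a failed verification triggers map-based correction and weight backtracking before fusion is retried. The sketch below is an assumed structural outline only: the fusion, five-dimensional verification, and map correction stages are passed in as placeholder callables because the claims specify them only functionally; the 15% penalty and 7.5% compensation are illustrative midpoints of claim 8's 10%-20% and 5%-10% ranges, and the 3-retry limit is claim 8's value.

```python
from typing import Callable, Dict, List

PENALTY_RATIO = 0.15        # claim 8 gives a 10%-20% range; midpoint assumed
COMPENSATION_RATIO = 0.075  # claim 8 gives a 5%-10% range; midpoint assumed
MAX_RETRIES = 3             # claim 8


def backtrack_weights(weights: Dict[str, float], responsible: str) -> Dict[str, float]:
    """Step S6b: penalise the sensor that caused the failed check, compensate the other."""
    other = 'lidar' if responsible == 'camera' else 'camera'
    adjusted = dict(weights)
    adjusted[responsible] *= (1.0 - PENALTY_RATIO)
    adjusted[other] *= (1.0 + COMPENSATION_RATIO)
    return adjusted


def fusion_with_correction_loop(
    targets,
    weights: Dict[str, float],
    hd_map,
    fuse: Callable,             # S4: attribute fusion under the current weights
    five_dim_checks: Callable,  # S5: returns responsible sensor names of failed checks
    map_correct: Callable,      # S6a: HD-map-based static/dynamic tag correction
):
    """Correction closed loop of steps S4-S7 (stage functions injected as placeholders)."""
    fused = fuse(targets, weights, hd_map)
    for _ in range(MAX_RETRIES):
        failed: List[str] = five_dim_checks(fused, hd_map)
        if not failed:
            return fused                                     # S7: verified tag library
        fused = map_correct(fused, hd_map)                   # S6a
        weights = backtrack_weights(weights, failed[0])      # S6b
        fused = fuse(targets, weights, hd_map)               # S6c: re-run S4
    return fused                                             # retry limit reached
```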
Description
Multi-sensor semantic tag fusion method for fusing a high-precision map

Technical Field

The invention relates to the technical field of automatic driving, and in particular to a multi-sensor semantic tag fusion method for fusing a high-precision map.

Background

With the development of automatic driving technology, multi-sensor fusion has become a core technology for environmental perception. The illumination-interference resistance of the laser radar is complementary to the rich texture recognition capability of the vision camera, so that accurate perception of road targets can be achieved. Semantic tags are the core output of the perception result, and their reliability directly determines the safety of automatic driving path planning and obstacle avoidance decisions. The high-precision map, with its centimeter-level static semantics and road topology information, provides a global reference for multi-sensor perception. The multi-sensor semantic tag fusion schemes in the prior art mainly have the following defects: (1) fixed sensor weights are preset only according to the environmental scene (such as illumination) and cannot respond to real-time fluctuations in a sensor's perception quality for a specific target; (2) the global reference function of the high-precision map is not fully utilized, so local false detections and missed detections of the sensors cannot be effectively corrected; (3) a systematic multi-dimensional verification mechanism is lacking, making sensor perception contradictions and tag anomalies difficult to identify; and (4) the anomaly correction mechanism is single and is not linked with the sensor weights for backtracking optimization, so that an unreliable sensor continues to influence subsequent fusion results. These drawbacks make it difficult for the accuracy and stability of existing solutions to meet the high safety requirements of automatic driving. Therefore, it is necessary to provide a new multi-sensor semantic tag fusion method for fusing high-precision maps to solve the above technical problems.

Disclosure of Invention

The invention solves the technical problem of providing a multi-sensor semantic tag fusion method for fusing a high-precision map, which can remarkably improve the accuracy and consistency of semantic tag fusion and the reliability of the system.
In order to solve the above technical problems, the multi-sensor semantic tag fusion method for fusing a high-precision map provided by the invention comprises the following steps: S1, data input, namely acquiring three-dimensional point cloud data of road targets collected by a laser radar and two-dimensional image data of the road scene collected by a vision camera, and simultaneously importing high-precision map data containing static semantic elements and road topology information; S2, preprocessing, namely performing sensor calibration and spatial alignment on the laser radar and the vision camera so that their data are unified into the same coordinate system, converting the global coordinates of the high-precision map into the vehicle local coordinate system, extracting contour, distance and point cloud density features of each target from the point cloud data, extracting texture, illumination intensity and target integrity features from the image data, and associating the static targets perceived by the sensors with the static semantic elements in the high-precision map; S3, double-factor weight configuration, namely calculating the basic weight of each sensor based on the illumination intensity of the image data, and calculating a map matching degree coefficient based on the spatial overlap rate and attribute consistency between the semantic labels perceived by the sensors and the corresponding static semantic elements of the high-precision map; S4, semantic fusion, namely associating the semantic tags generated by the laser radar and the vision camera for the same target, and performing attribute fusion based on the final weights to generate an initial global fusion tag; S5, verification, namely performing five-dimensional verification on the initial global fusion tag, the five dimensions being tag consistency verification, geometric semantic verification, confidence verification, static existence verification and dynamic rationality verification; S6, anomaly correction, namely if any one of the verifications fails, executing the following operations: a. correcting the static semantic tags with the high-precision map as the reference and/or correcting the dynamic semantic tags in combination with the road topology information of the high-precision map; b. executing weight backtracking adjustment, namely reducing the weight of the responsible sensor that caused the anomaly according to a preset penalty ratio and correspondingly increasing the weight of the other sensor according to a preset compensation ratio; c. returning to step S4 by