CN-121973799-A - Intelligent vehicle end-to-end behavior decision method and system based on multi-mode information fusion

CN121973799A

Abstract

The invention discloses an intelligent vehicle end-to-end behavior decision method and system based on multi-modal information fusion. The method comprises: acquiring RGB data and point cloud data of the surrounding environment; performing distortion compensation on the RGB data; translating and rotating the point cloud data to convert it into the camera coordinate system and obtain a front view of the point cloud; fusing the two modalities to obtain dense depth information; inputting the dense depth information, as environment perception data, into a behavior decision network and outputting the decision action of the intelligent vehicle; and obtaining a reference instant return at each decision moment, comparing it with the cumulative return of the decision action, and constraining and correcting the decision action with a rule-based backup policy when the cumulative return of the decision action falls below the reference instant return. By generating decision actions with the behavior decision network and enforcing a safety constraint through the rule-based backup policy, the invention ensures decision effectiveness and driving safety of the intelligent vehicle in complex, dynamic road environments.

Inventors

  • GAO JIABING
  • CAO MINGCHUN
  • ZHAO WANZHONG
  • WANG CHUNYAN
  • ZHANG ZIYU

Assignees

  • Nanjing University of Aeronautics and Astronautics

Dates

Publication Date
2026-05-05
Application Date
2025-12-30

Claims (7)

  1. An intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion, characterized by comprising the following steps: Step 1), acquiring RGB data and point cloud data of the surrounding environment through a vehicle-mounted camera and a LiDAR of the intelligent vehicle; Step 2), performing distortion compensation on the RGB data; Step 3), translating and rotating the point cloud data to convert it into the camera coordinate system, and obtaining a front view of the point cloud using a front-view projection method; Step 4), performing multi-modal fusion on the RGB data and point cloud data preprocessed in steps 2) and 3) to obtain fused dense depth information; Step 5), inputting the dense depth information, as environment perception data, into a trained behavior decision network and outputting the decision action of the intelligent vehicle; and Step 6), obtaining the reference instant return at each decision moment, comparing it with the cumulative return of the decision action from step 5), and constraining and correcting the decision action based on a rule-based backup policy when the cumulative return of the decision action is lower than the reference instant return.
  2. The intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion according to claim 1, wherein the distortion compensation of the RGB data in step 2) considers radial and tangential distortion simultaneously, expressed as:
     x' = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x y + p_2 (r^2 + 2 x^2)
     y' = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y^2) + 2 p_2 x y
     where k_1, k_2, k_3, p_1 and p_2 are the five distortion parameters; r is the radial distance from the sensed target to the optical center in the camera coordinate system; and (x', y') are the corrected position coordinates of the sensed target.
  3. The intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion according to claim 2, wherein the specific steps of preprocessing the point cloud data in step 3) are as follows:
     31) Define the LiDAR coordinate system as O_L and the camera coordinate system as O_C, and denote the spatial position coordinates of the same sensed target T in the two coordinate systems as T_L and T_C respectively. The three-dimensional spatial transformation of the target T is expressed as:
     T_C = R_C T_L + S_C
     where T_L is a spatial point in the O_L coordinate system, T_C is a spatial point in the O_C coordinate system, R_C is the spatial rotation matrix, and S_C is the translation matrix.
     32) Decompose the spatial rotation matrix R_C into angular variables rotating counterclockwise about the X, Y and Z axes, obtaining three rotation matrices R_x(α), R_y(β) and R_z(γ):
     R_x(α) = [[1, 0, 0], [0, cos α, -sin α], [0, sin α, cos α]]
     R_y(β) = [[cos β, 0, sin β], [0, 1, 0], [-sin β, 0, cos β]]
     R_z(γ) = [[cos γ, -sin γ, 0], [sin γ, cos γ, 0], [0, 0, 1]]
     where α, β and γ are the rotation angles about the X, Y and Z axes respectively. The spatial point is rotated in the order X axis, Y axis, Z axis, and the three rotation matrices are multiplied to represent the conversion of the three-dimensional spatial state:
     R_C = R_z(γ) R_y(β) R_x(α)
     33) Use the translation matrix S_C to translate the rotated spatial point from the O_L coordinate system into the O_C coordinate system:
     T_C = R_C T_L + S_C
     34) After the point cloud data has been converted from the O_L coordinate system to the O_C coordinate system, project it onto the front view: calculate the projection position of each spatial point in the O_C coordinate system, check whether the position overflows the image boundary, and clip it accordingly if it does.
  4. The intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion according to claim 3, wherein the multi-modal fusion of the RGB data and point cloud data by the depth information fusion network in step 4) comprises the following specific steps:
     41) Use a global transformer encoder to obtain global context information from the RGB data and the point cloud data respectively; the encoded features are expressed as:
     F_RGB = enc(f_1^RGB, f_2^RGB, ..., f_N^RGB)
     F_PC = enc(f_1^PC, f_2^PC, ..., f_N^PC)
     where F_RGB and F_PC are the global encoded features obtained from the RGB data and the point cloud data respectively; N is the number of spatial points; enc is the global transformer encoder; f_n^RGB is the encoded feature of the nth spatial point in the RGB data; and f_n^PC is the encoded feature of the nth spatial point in the point cloud data.
     42) Aggregate the encoded features along bidirectional top-down and bottom-up paths and divide the encoder output into four groups; the outputs of the last blocks of the first, second, third and fourth groups are denoted C_1, C_2, C_3 and C_4 respectively, and the encoded feature set C = {C_1, C_2, C_3, C_4} composed of them is taken as the input of a multi-stage decoder.
     43) Feed the encoded features of each group in the multi-stage decoder into a deconvolution block comprising two deconvolution layers with 4×4 and 16×16 kernels, apply normalization, obtain high-resolution detail features through a connection layer, and fuse the detail features into an overall feature to obtain the dense depth information:
     D_RGB = Dec(F_RGB), D_PC = Dec(F_PC)
     D = Conv(D_RGB, D_PC)
     where D_RGB and D_PC are the high-resolution detail features obtained from the RGB data and the point cloud data respectively; D is the fusion of the detail features of the RGB data and the point cloud data; Dec is the multi-stage decoder; and Conv is a convolution layer.
  5. The intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion according to claim 1, wherein the specific steps of generating the decision action of the intelligent vehicle with the trained behavior decision network in step 5) are as follows: the behavior decision network is a deep neural network comprising an input layer, hidden layers, a normalization layer and an output layer; the input of the input layer is the multi-modal fusion feature, the normalization layer adopts a Softmax activation function, and the intelligent vehicle behavior strategy output by the output layer is the decision action.
     51) The training objective of the behavior decision network is to minimize the cross-entropy loss function:
     L(θ) = -Σ_a y_a log p_θ(a)
     where θ is the parameter of the decision network; p_θ(a) is the probability of action a output by the decision network; and y_a is the one-hot encoded value of the output action.
     52) During training, the state-action pairs in the experience pool are randomly assigned to multiple batches, and the parameters θ of the behavior decision network are updated directly through the batch loss gradient ∇_θ L, calculated as:
     ∇_θ L = (1/M) Σ_{m=1}^{M} ∇_θ L_m
     where M is the number of random action trajectories. The parameter update through the loss gradient is expressed as:
     θ_{k+1} = θ_k - η ∇_θ L
     where η is the learning rate of the decision network, and θ_k and θ_{k+1} are the decision network parameters before and after the update in the kth training iteration respectively.
     53) After each parameter update, the state-action pairs in the experience pool are shuffled and re-randomly assigned to new batches, the batch loss gradient is computed again and the network parameters are updated accordingly; iterative training ends when the loss stabilizes at a low level.
     54) The current state s_t of the traffic scene is input to the trained behavior decision network, which outputs the decision action of the intelligent vehicle:
     a_t = f_θ(s_t)
     where a_t is the decision action of the intelligent vehicle and f_θ is the transfer function of the network.
  6. The intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion according to claim 5, wherein the specific steps of step 6) are as follows:
     61) Obtain the decision action a_t^b of the rule-based backup policy in the current state s_t of the traffic scene, and solve the current instant return from it as the baseline:
     a_t^b = f_b(s_t)
     r_t^* = r(s_t, a_t^b)
     where f_b is the transfer function of the rule-based backup policy and r_t^* is the current optimal instant return.
     62) Based on the decision action a_t^b, sample a decision trajectory from the current moment t to moment t+T, and calculate the cumulative return corresponding to the rule-based backup policy over the decision process:
     G_t^b = Σ_{i=0}^{T} γ^i r_{t+i}^*
     where G_t^b is the cumulative return of the rule-based backup policy; T is the decision interval; γ is the discount factor; and r_{t+T}^* is the optimal instant return at moment t+T.
     63) Based on the decision action a_t derived from the behavior decision network, sample a decision trajectory from the current moment t to moment t+T and solve the corresponding cumulative return:
     G_t = Σ_{i=0}^{T} γ^i r_{t+i} + γ^T V(s_{t+T})
     where G_t is the cumulative return of the behavior decision network; r_t is the action return at moment t; V(s_{t+T}) is the state value corresponding to the final state; r_{t+i} is the action return at moment t+i; and r_{t+T} is the action return at moment t+T.
     64) Judge whether the cumulative return of the decision action output by the behavior decision network is greater than or equal to the reference instant return obtained by the rule-based backup policy; if so, the action is regarded as a valid and safe output; otherwise, the decision action output by the behavior decision network is corrected based on the rule-based backup policy.
  7. An intelligent vehicle end-to-end behavior decision system based on multi-modal information fusion, characterized by comprising: a data acquisition module for acquiring RGB data and point cloud data of the surrounding environment through a vehicle-mounted camera and a LiDAR of the intelligent vehicle; a first data processing module for performing distortion compensation on the RGB data to complete preprocessing of the RGB data; a second data processing module for translating and rotating the point cloud data, converting it into the camera coordinate system, and obtaining a front view of the point cloud using a front-view projection method to complete preprocessing of the point cloud data; a data fusion module for performing multi-modal fusion on the preprocessed RGB data and point cloud data to obtain fused dense depth information; a decision action output module for inputting the dense depth information, as environment perception data, into the trained behavior decision network and outputting the decision action of the intelligent vehicle; and a decision optimization module for obtaining the reference instant return at each decision moment, comparing it with the cumulative return of the decision action, and constraining and correcting the decision action based on the rule-based backup policy when the cumulative return of the decision action is lower than the reference instant return.
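The formulas in claims 2 to 6 appear as drawings in the original filing and were reconstructed above from the surrounding definitions. The short Python sketches below illustrate, under stated assumptions, how the claimed steps could be realized; they are illustrative readings, not part of the claims or the patent's disclosed implementation. First, claim 2's distortion compensation, assuming the standard five-parameter Brown-Conrady model (k_1, k_2, k_3 radial; p_1, p_2 tangential):

```python
import numpy as np

def distortion_model(points, k1, k2, k3, p1, p2):
    """Radial + tangential (Brown-Conrady) distortion with five parameters,
    applied to normalized image coordinates. A sketch of the model named in
    claim 2; the exact formulation in the patent drawings may differ."""
    x, y = points[:, 0], points[:, 1]
    r2 = x**2 + y**2  # squared radial distance to the optical center
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_c = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x**2)
    y_c = y * radial + p1 * (r2 + 2 * y**2) + 2 * p2 * x * y
    return np.stack([x_c, y_c], axis=1)
```

In practice the correction inverts this forward model, for example iteratively or with a library routine such as OpenCV's cv2.undistortPoints.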
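For claim 3's preprocessing, a sketch of the rigid LiDAR-to-camera transform and front-view projection with boundary clipping. The camera intrinsic matrix K is an assumption; the claim itself specifies only the rotation R_C, the translation S_C, and the boundary check:

```python
import numpy as np

def rotation_xyz(alpha, beta, gamma):
    """Counterclockwise rotations about X, Y, Z, composed in X -> Y -> Z
    order as in claim 3, i.e. R_C = R_z @ R_y @ R_x."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def lidar_to_front_view(points_L, R_C, S_C, K, img_w, img_h):
    """Transform LiDAR points T_L into the camera frame (T_C = R_C T_L + S_C),
    project them with intrinsics K, and clip points that overflow the image."""
    points_C = points_L @ R_C.T + S_C        # rotation + translation (steps 32-33)
    points_C = points_C[points_C[:, 2] > 0]  # keep points in front of the camera
    uv = points_C @ K.T
    uv = uv[:, :2] / uv[:, 2:3]              # perspective division to pixel coordinates
    inside = ((uv[:, 0] >= 0) & (uv[:, 0] < img_w) &
              (uv[:, 1] >= 0) & (uv[:, 1] < img_h))  # boundary check (step 34)
    return uv[inside], points_C[inside, 2]   # pixel positions and depths
```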
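For claim 4, a sketch of the deconvolution block of the multi-stage decoder in PyTorch. The claim fixes only the two kernel sizes (4×4 and 16×16) and the use of normalization; channel widths, strides, padding, and the concatenation plus 1×1 fusion convolution are assumptions chosen to keep shapes consistent:

```python
import torch
import torch.nn as nn

class DeconvBlock(nn.Module):
    """Two transposed convolutions (4x4 and 16x16 kernels) with normalization,
    upsampling grouped encoder features into high-resolution detail features.
    Strides/padding are illustrative; each layer doubles spatial resolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up1 = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
        self.up2 = nn.ConvTranspose2d(out_ch, out_ch, kernel_size=16, stride=2, padding=7)
        self.norm1 = nn.BatchNorm2d(out_ch)
        self.norm2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.norm1(self.up1(x)))
        return self.act(self.norm2(self.up2(x)))

class DepthFusion(nn.Module):
    """Fuse RGB and point cloud detail features into dense depth,
    D = Conv(D_RGB, D_PC); concatenation + 1x1 conv is an assumption."""
    def __init__(self, ch=64):
        super().__init__()
        self.fuse = nn.Conv2d(2 * ch, 1, kernel_size=1)

    def forward(self, d_rgb, d_pc):
        return self.fuse(torch.cat([d_rgb, d_pc], dim=1))
```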
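For claim 5's training procedure, a sketch of the batch cross-entropy loop with re-shuffling between updates. The `policy` object with `forward`, `backward` and `theta` attributes is a hypothetical interface standing in for the deep network:

```python
import numpy as np

def train_decision_network(policy, experience_pool, lr=1e-3, batch_size=64,
                           max_epochs=100, tol=1e-4):
    """Minimize cross-entropy between network action probabilities and one-hot
    expert actions (step 51), update parameters by the batch loss gradient
    (step 52), then reshuffle and repeat until the loss stabilizes (step 53)."""
    prev_loss = float("inf")
    for _ in range(max_epochs):
        np.random.shuffle(experience_pool)  # re-randomize batches each pass
        losses = []
        for start in range(0, len(experience_pool), batch_size):
            batch = experience_pool[start:start + batch_size]
            states = np.stack([s for s, _ in batch])
            actions = np.array([a for _, a in batch])
            probs = policy.forward(states)           # p_theta(a | s)
            nll = -np.log(probs[np.arange(len(batch)), actions] + 1e-12)
            losses.append(float(np.mean(nll)))       # batch cross-entropy loss
            grad = policy.backward(states, actions)  # batch loss gradient
            policy.theta = policy.theta - lr * grad  # theta_{k+1} = theta_k - eta * grad
        mean_loss = float(np.mean(losses))
        if abs(prev_loss - mean_loss) < tol:         # loss has stabilized
            break
        prev_loss = mean_loss
    return policy
```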
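For claim 6's safety constraint, a sketch of the comparison between the network action's cumulative return and the rule-based backup policy's reference instant return, with fallback when the network underperforms. `backup_policy` and `reward_fn` are assumed callables, not patent APIs:

```python
def discounted_return(rewards, gamma, terminal_value=0.0):
    """Backward recursion G = r + gamma * G, seeded with the terminal state
    value: the discounted cumulative return over a sampled trajectory."""
    G = terminal_value
    for r in reversed(rewards):
        G = r + gamma * G
    return G

def constrain_decision(state, network_action, network_rewards, terminal_value,
                       backup_policy, reward_fn, gamma=0.99):
    """Step 6: compare the network trajectory's cumulative return with the
    backup policy's reference instant return r*_t; fall back when lower."""
    backup_action = backup_policy(state)        # a_t^b of the backup policy
    baseline = reward_fn(state, backup_action)  # reference instant return r*_t
    G = discounted_return(network_rewards, gamma, terminal_value)
    if G >= baseline:
        return network_action                   # valid and safe output
    return backup_action                        # corrected by the backup policy
```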
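Finally, reading claims 1 and 7 together, one decision cycle of the overall pipeline might be organized as below. Every stage is injected as a callable; all of them are hypothetical stand-ins for the patent's modules:

```python
def decision_cycle(rgb, lidar_points, preprocess_rgb, preprocess_lidar,
                   fusion_net, decision_net, backup_policy, reward_fn,
                   rollout_return):
    """One cycle of the claimed method (steps 1-6), with each module passed
    in as a callable. Illustrative composition only, under assumed interfaces."""
    rgb_u = preprocess_rgb(rgb)                  # step 2: distortion compensation
    front_view = preprocess_lidar(lidar_points)  # step 3: transform + front view
    dense_depth = fusion_net(rgb_u, front_view)  # step 4: dense depth fusion
    action = decision_net(dense_depth)           # step 5: decision action
    G = rollout_return(dense_depth, action)      # cumulative return over [t, t+T]
    r_star = reward_fn(dense_depth, backup_policy(dense_depth))  # baseline r*_t
    if G < r_star:                               # step 6: safety constraint
        action = backup_policy(dense_depth)      # corrected decision action
    return action
```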

Description

Intelligent vehicle end-to-end behavior decision method and system based on multi-mode information fusion

Technical Field

The invention belongs to the technical field of intelligent vehicle automatic driving behavior decision-making, and particularly relates to an intelligent vehicle end-to-end behavior decision method and system based on multi-modal information fusion.

Background

With the rapid development of the automobile industry, intelligent driving has become a research direction of major industrial interest. Intelligent vehicles generally follow the "perception-decision-execution" technical architecture. In this framework, the perception system first acquires multi-source environment information through sensors such as the vehicle-mounted camera and LiDAR and fuses it; the decision system then formulates a safe behavior strategy and plans a reasonable decision trajectory based on the perception information combined with road structure and traffic constraints; finally, the low-level actuators solve for the specific front-wheel steering angle and throttle/brake pedal signals through a tracking control algorithm to follow the target trajectory. As the brain of the intelligent vehicle, the decision system plays a core role in behavior decision and trajectory planning. In recent years, end-to-end behavior decision technology has shown great potential for improving the autonomy and driving safety of intelligent vehicles, and has gradually become one of the mainstream research directions in the field of intelligent driving. However, in highly dynamic scenes such as urban roads, the environmental information acquired by a single type of vehicle-mounted sensor often lacks robustness, a problem that is especially pronounced under complex conditions such as rain, snow, occlusion and nighttime. For example, while cameras have an advantage in color recognition, their information acquisition is prone to failure in occluded scenes due to boundary blurring. Likewise, LiDAR excels at spatial ranging because its point cloud data contains rich depth information, but it struggles to accurately discriminate object types because it lacks texture information. These inherent sensor limitations make it difficult for end-to-end decision systems to obtain reliable and comprehensive environment perception information, which in turn affects the safety and effectiveness of behavior decisions. Therefore, an intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion is needed to improve perception robustness and decision safety in dynamic scenes.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides an intelligent vehicle end-to-end behavior decision method and system based on multi-modal information fusion, so as to solve the increased decision risk caused by the perception limitations of a single vehicle-mounted sensor in the prior art.
To achieve the above purpose, the invention adopts the following technical scheme. The invention discloses an intelligent vehicle end-to-end behavior decision method based on multi-modal information fusion, comprising the following steps: Step 1), acquiring RGB data and point cloud data of the surrounding environment through a vehicle-mounted camera and a LiDAR of the intelligent vehicle; Step 2), performing distortion compensation on the RGB data to complete preprocessing of the RGB data; Step 3), translating and rotating the point cloud data, converting it into the camera coordinate system, and obtaining a front view of the point cloud using a front-view projection method to complete preprocessing of the point cloud data; Step 4), performing multi-modal fusion on the RGB data and point cloud data preprocessed in steps 2) and 3) to obtain fused dense depth information; Step 5), inputting the dense depth information, as environment perception data, into a trained behavior decision network and outputting the decision action of the intelligent vehicle; and Step 6), obtaining the reference instant return at each decision moment, comparing it with the cumulative return of the decision action from step 5), and constraining and correcting the decision action based on the rule-based backup policy when the cumulative return of the decision action is lower than the reference instant return.

Further, the distortion compensation of the RGB data in step 2) considers radial and tangential distortion simultaneously, expressed as:

x' = x(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x y + p_2 (r^2 + 2 x^2)
y' = y(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y^2) + 2 p_2 x y

where k_1, k_2, k_3, p_1 and p_2 are the five distortion parameters; r is the radial distance from the sensed target to the optical center in the camera coordinate system; and (x', y') are the corrected position coordinates of the sensed target. Further, the specific steps of preprocessing the point cloud