CN-121978933-A - Sensing control cooperative automatic driving method based on online evolutionary learning

CN121978933A

Abstract

The invention provides a sensing-control cooperative automatic driving method based on online evolutionary learning, and relates to the technical field of automatic driving. The method comprises the following steps. S1: multi-modal data acquisition and synchronization, in which raw observation data from multiple types of on-board sensors at the current moment t are acquired and fused. S2: forward inference and trajectory generation, in which the fused observation data are input into an automatic driving model for forward inference, which outputs a plurality of candidate trajectories of the ego vehicle over a plurality of future time steps together with their probability distribution; the automatic driving model comprises at least a perception encoder, a behavior predictor, and a trajectory planner. The invention can continuously optimize the perception, prediction, and planning modules during actual deployment, so as to cope with scenes unseen in the training stage and with distribution shift.

Inventors

  • Song Liang
  • Jin Jiayue
  • Wang Yingshuo
  • Qian Lang
  • Zhang Jingyu
  • Ju Chuanyu

Assignees

  • Fudan University (复旦大学)

Dates

Publication Date
2026-05-05
Application Date
2026-01-29

Claims (9)

  1. A sensing-control cooperative automatic driving method based on online evolutionary learning, characterized by comprising the following steps: step S1, multi-modal data acquisition and synchronization: acquiring and fusing raw observation data from multiple types of on-board sensors at the current moment t; step S2, forward inference and trajectory generation: inputting the fused observation data into an automatic driving model for forward inference, and outputting a plurality of candidate trajectories of the ego vehicle over a plurality of future time steps together with their probability distribution; step S3, uncertainty evaluation and update triggering: calculating the information entropy of the trajectory probability distribution output in step S2, and triggering an online evolutionary learning process when the information entropy exceeds a dynamic threshold based on historical data statistics; step S4, online evolutionary learning based on motion-compensation loss minimization: after an update is triggered, computing, through a dynamic model and according to the controlled physical trajectory, the motion-compensation loss difference between the perception encoder and the behavior predictor, and adaptively optimizing perception and prediction online by minimizing this difference; and step S5, iteration and loop closure: using the updated model to process the observation data at subsequent moments, and repeating steps S1 to S4.
  2. The sensing-control cooperative automatic driving method based on online evolutionary learning according to claim 1, wherein the online evolutionary learning based on attention and self-supervision in step S4 comprises, after the update is triggered: S4-1, extracting the attention weights of all surrounding dynamic obstacles from the trajectory planner, and selecting the key obstacles with the top L attention weights, wherein L is a positive integer; S4-2, for each key obstacle i, obtaining the position $\hat{p}_i^{t+1}$ and velocity $\hat{v}_i^{t+1}$ at the next moment t+1 as predicted at the current moment t, and obtaining the observed position $p_i^{t+1}$ and observed velocity $v_i^{t+1}$ at the real moment t+1; S4-3, constructing a motion-compensation loss function $\mathcal{L}_{mc} = \sum_i \left( \|\hat{p}_i^{t+1} - p_i^{t+1}\|^2 + \lambda \|\hat{v}_i^{t+1} - v_i^{t+1}\|^2 \right)$, wherein λ is a smoothing weight coefficient; S4-4, computing gradients of the loss function $\mathcal{L}_{mc}$ at a preset minimum learning rate, and synchronously fine-tuning the parameters of the perception encoder and the behavior predictor while keeping the parameters of the trajectory planner unchanged.
  3. The sensing-control cooperative automatic driving method based on online evolutionary learning according to claim 1, wherein in step S3 the dynamic threshold is set by computing the information-entropy sequence of planning decisions on the validation set corresponding to the original training data of the automatic driving model, calculating the mean μ and standard deviation σ of this sequence, and setting the trigger threshold to $\tau = \mu + k\sigma$, where k is an adjustable sensitivity coefficient.
  4. The sensing-control cooperative automatic driving method based on online evolutionary learning according to claim 3, wherein the adjustable sensitivity coefficient k ranges from 0.6 to 0.8.
  5. The sensing-control cooperative automatic driving method based on online evolutionary learning according to claim 2, wherein the value of L in step S4-1 ranges from 3 to 8.
  6. The sensing-control cooperative automatic driving method based on online evolutionary learning according to claim 2, wherein in step S4-3 the smoothing weight coefficient λ ranges from 0.1 to 0.3.
  7. The sensing-control cooperative automatic driving method based on online evolutionary learning according to claim 2, wherein in step S4-4 the range of the preset minimum learning rate is .
  8. An electronic device, comprising at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor which, when executed, enable the at least one processor to perform the method according to any one of claims 1 to 7.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 7.
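
The entropy-triggered update of steps S2 and S3 (claims 1 and 3) can be sketched in Python. This is an illustrative sketch only: the function and variable names (`candidate_probs`, `validation_entropies`) are not from the patent, and the entropy is assumed to be the standard Shannon entropy of the candidate-trajectory probability distribution.

```python
# Hypothetical sketch of steps S2-S3: entropy-based uncertainty evaluation
# and the dynamic trigger threshold tau = mu + k*sigma of claim 3.
import math

def trajectory_entropy(candidate_probs):
    """Shannon entropy H = -sum_i p_i * log(p_i) over candidate trajectories."""
    return -sum(p * math.log(p) for p in candidate_probs if p > 0.0)

def dynamic_threshold(validation_entropies, k=0.7):
    """Trigger threshold tau = mu + k*sigma over the validation-set entropy
    sequence; k is the adjustable sensitivity coefficient (0.6-0.8 per claim 4)."""
    n = len(validation_entropies)
    mu = sum(validation_entropies) / n
    var = sum((h - mu) ** 2 for h in validation_entropies) / n
    return mu + k * math.sqrt(var)

def should_trigger_update(candidate_probs, validation_entropies, k=0.7):
    """True when current planning entropy exceeds the dynamic threshold (step S3)."""
    return trajectory_entropy(candidate_probs) > dynamic_threshold(validation_entropies, k)
```

A confident planner (one trajectory with probability 1) yields zero entropy and never triggers an update, while a near-uniform distribution over candidates yields entropy close to log of the candidate count and triggers whenever that exceeds μ + kσ.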
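
The motion-compensation loss and obstacle selection of step S4 (claims 2 and 5) can likewise be sketched. This sketch covers only loss computation and top-L selection, not the gradient fine-tuning of the encoder and predictor; the assumed loss form is squared position error plus a λ-weighted squared velocity error, summed over the key obstacles, and all names here are illustrative rather than taken from the patent.

```python
# Hypothetical sketch of steps S4-1 and S4-3: select the top-L attended
# obstacles, then compute L_mc = sum_i (||p_hat - p||^2 + lam * ||v_hat - v||^2).

def top_l_obstacles(attention_weights, top_l=5):
    """Indices of the top_l obstacles with the largest planner attention (S4-1);
    claim 5 suggests top_l in the range 3 to 8."""
    order = sorted(range(len(attention_weights)),
                   key=lambda i: attention_weights[i], reverse=True)
    return order[:top_l]

def motion_compensation_loss(predicted, observed, lam=0.2):
    """predicted/observed: lists of (px, py, vx, vy) tuples for the key
    obstacles, i.e. states at t+1 predicted at t vs. actually observed at t+1.
    lam is the smoothing weight coefficient (0.1-0.3 per claim 6)."""
    loss = 0.0
    for (ppx, ppy, pvx, pvy), (opx, opy, ovx, ovy) in zip(predicted, observed):
        pos_err = (ppx - opx) ** 2 + (ppy - opy) ** 2
        vel_err = (pvx - ovx) ** 2 + (pvy - ovy) ** 2
        loss += pos_err + lam * vel_err
    return loss
```

In a full implementation this scalar would be minimized at the preset minimum learning rate over the perception-encoder and behavior-predictor parameters only, with the trajectory-planner parameters frozen (S4-4).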

Description

Sensing control cooperative automatic driving method based on online evolutionary learning

Technical Field

The invention relates to the intersection of artificial intelligence, automatic driving, and computer vision, and in particular to a sensing-control cooperative automatic driving method based on online evolutionary learning.

Background

With the development of deep learning, automatic driving systems have made remarkable progress in perception, decision-making, and control. Mainstream automatic driving architectures fall into two broad categories: modular pipelines and end-to-end models. A modular architecture decomposes the driving task into independent perception, prediction, planning, and control modules; it offers strong interpretability and easy debugging, but suffers from error accumulation and information loss between modules. An end-to-end architecture generates control commands or trajectories directly from sensor inputs through a unified neural network, which enables global optimization, but places extremely high demands on data quality and scale and is less interpretable.

Whether modular or end-to-end, most existing systems follow a static "train-freeze-deploy" paradigm: the model is trained offline on a fixed dataset, and once deployed its parameters are no longer updated. This paradigm rests on a key assumption that the data distributions of the training and deployment environments are consistent. In real automatic driving scenarios this assumption is often violated: vehicles travel across different cities, weather conditions, and traffic regulations, producing significant distribution shifts in the environmental data.
Such cross-regional distribution shift directly degrades model performance, manifesting as missed detections in perception, inaccurate prediction, and improper planning, and thereby seriously threatens driving safety. To alleviate the distribution-shift problem, researchers have proposed various methods. Domain adaptation techniques attempt to generalize a model from a source domain to a target domain by learning domain-invariant features or performing feature alignment; however, such methods usually require access to target-domain data during training, or complex adversarial training, and are difficult to apply in a continuously changing, unpredictable real environment. Test-time adaptation is another emerging family of methods that slightly adjusts the model during inference using unlabeled test data; however, existing methods focus on perception tasks such as image classification and segmentation, lack closed-loop cooperation with the downstream prediction and planning tasks, and may lag behind the environment or suffer model collapse due to short-horizon updates.

Therefore, a core challenge in the current automatic driving field is how to build an intelligent system that can keep learning and self-evolving after deployment, so as to adapt to a dynamic open world. This requires online learning capability, cross-module co-optimization capability, and awareness of environmental uncertainty.

Disclosure of the Invention

Aiming at the defects of the prior art, the invention provides a sensing-control cooperative automatic driving method based on online evolutionary learning, which addresses the static, rigid nature of existing automatic driving systems and their difficulty in adapting to environmental change.
In order to achieve the above aim, the invention is realized through the following technical scheme. The sensing-control cooperative automatic driving method based on online evolutionary learning comprises the following steps: step S1, multi-modal data acquisition and synchronization: acquiring and fusing raw observation data from multiple types of on-board sensors at the current moment t; step S2, forward inference and trajectory generation: inputting the fused observation data into an automatic driving model for forward inference, and outputting a plurality of candidate trajectories of the ego vehicle over a plurality of future time steps together with their probability distribution; step S3, uncertainty evaluation and update triggering: calculating the information entropy of the trajectory probability distribution output in step S2, and triggering an online evolutionary learning process when the information entropy exceeds a dynamic threshold based on historical data statistics; step S4, online evolutionary learning based on attention and self-supervision, after the update is triggered: S4-1, extracting attention weights of all surrounding dyn