CN-121973772-A - Vehicle cooperative adaptive cruise control method based on potential driving motivation

Abstract

The invention relates to the technical field of intelligent transportation systems and autonomous vehicle control, and discloses a vehicle cooperative adaptive cruise control method based on potential driving motivation. The method configures, for each connected and automated vehicle, an independent actor network, a critic network, and a potential driving motivation model based on variational inference. The model extracts and quantifies the potential driving motivation from the vehicle's local observations. Monte Carlo sampling and KL divergence calculation then convert the unique influence of this motivation on decision-making into a potential reward, which is combined with the environmental reward to form a mixed reward signal that guides the optimization of each agent's policy and improves cooperation efficiency. The algorithm is trained with an online prioritized experience replay mechanism; the method avoids the delay and single-point-of-failure risk of centralized communication and enhances the robustness and scalability of the system.
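As a rough illustration of how the potential reward could be formed, the sketch below follows the outline given in the abstract and in claims 7 and 8: an actual policy conditioned on the inferred motivation, a counterfactual policy approximated by a Monte Carlo average over posterior samples, a KL divergence between the two, min-max normalization, and a mixed reward. The toy policy `toy_policy`, the scalar motivation `z`, the three-action space, the Gaussian posterior parameters, the normalization bounds, and the mixing weight `weight` are all assumptions made for this example and are not taken from the application.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_policy(obs, z):
    """Hypothetical actor: action probabilities given observation and motivation z."""
    drive = obs + z  # assumed linear coupling, for illustration only
    return softmax([drive, 0.0, -drive])  # e.g. accelerate / hold / brake


def counterfactual_policy(obs, mean, std, num_samples, rng):
    """Monte Carlo average of the policy over i.i.d. motivation samples."""
    if num_samples < 2:
        raise ValueError("claim 8 calls for at least two samples")
    totals = [0.0, 0.0, 0.0]
    for _ in range(num_samples):
        z_sample = rng.gauss(mean, std)  # draw from an assumed Gaussian posterior
        probs = toy_policy(obs, z_sample)
        totals = [t + p for t, p in zip(totals, probs)]
    return [t / num_samples for t in totals]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def min_max_normalize(value, lower, upper):
    """Scale value into [0, 1] using bounds gathered during training."""
    if upper <= lower:
        return 0.0
    return min(1.0, max(0.0, (value - lower) / (upper - lower)))


def mixed_reward(env_reward, potential_reward, weight=0.1):
    """Environment reward plus a weighted, normalized potential reward."""
    return env_reward + weight * potential_reward


rng = random.Random(0)
obs, z = 0.5, 1.2
actual = toy_policy(obs, z)
counterfactual = counterfactual_policy(obs, mean=0.0, std=1.0, num_samples=16, rng=rng)
raw = kl_divergence(actual, counterfactual)
normalized = min_max_normalize(raw, lower=0.0, upper=2.0)
print(mixed_reward(env_reward=-0.3, potential_reward=normalized))
```

In a real training run the bounds passed to `min_max_normalize` would be the minimum and maximum potential reward observed over the training samples, as claim 7 describes, rather than the fixed values used here.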

Inventors

YIN YUCONG
SUN CHAO
XING LIANGPING
DONG YUJIN
SU YANXU
SUN CHANGYIN

Assignees

Anhui University (安徽大学)

Dates

Publication Date: 20260505
Application Date: 20260204

Claims (10)

1. A vehicle cooperative adaptive cruise control method based on potential driving motivation, characterized by comprising the following steps: S1, first performing vehicle dynamics modeling, then constructing a multi-agent actor-critic algorithm under a decentralized framework based on the actor-critic framework of multi-agent reinforcement learning, and configuring an independent actor network, critic network, and potential driving motivation model for each connected and automated vehicle, wherein the actor network and the critic network each have a three-layer structure of a fully connected layer, a long short-term memory layer, and a fully connected layer, and the potential driving motivation model is used to infer latent useful information in the vehicle cooperative adaptive cruise control scenario; S2, each connected and automated vehicle acquires state information of the preceding and following vehicles through vehicle-to-vehicle communication and uses it as its own observation space, and the observation is input into the potential driving motivation model to infer the potential driving motivation of each connected and automated vehicle, which generates the input of the actor network and thereby realizes the longitudinal control decision of the vehicle; and S3, finally, based on the multi-agent actor-critic algorithm of S1, coordinating the speed and inter-vehicle distance of each connected and automated vehicle in the fleet.
2. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 1, characterized in that the vehicle dynamics modeling in step S1 comprises: first, longitudinal dynamics modeling, which includes a continuous model and a discrete model of vehicle longitudinal dynamics and describes the dynamic relationship among inter-vehicle distance, speed, and acceleration; then, constructing a vehicle behavior model based on the optimal velocity model, which accurately reflects how vehicles adjust dynamically under real traffic conditions through quantitative analysis of driving behavior; and finally, a multi-agent reinforcement learning formulation, in which the vehicle cooperative adaptive cruise control problem is formulated as a decentralized partially observable Markov decision process.
3. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 1, characterized in that in step S2 the local observation information of the fleet vehicles is preprocessed, specifically: abnormal data are identified using a dynamic constraint threshold and replaced by smooth interpolation of the observations at the preceding and following time steps; multi-source state information is integrated using sensor fusion; temporal consistency of the data is ensured by timestamp alignment; and the preprocessed observations, rather than the global state, are input into the potential driving motivation model.
4. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 1, characterized in that the first two layers of the critic network use a ReLU activation function to enhance nonlinear expressive capability, the last layer of the critic network uses a linear activation function to output the value function, and the last layer of the actor network uses a softmax activation function to match the action space.
5. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 1, characterized in that in step S1 the potential driving motivation model is built using variational inference, specifically: first, a prior distribution and a posterior distribution are defined and both are modeled as multivariate diagonal Gaussian distributions; then, the prior distribution is approximated by a generative network and the posterior distribution by an inference network, the generative network and the inference network each having a three-layer neural network architecture; a data generator is provided whose input is the potential motivations of all connected and automated vehicles and whose output is a distribution over the global state; and finally, a loss function is constructed from the evidence lower bound, and the parameters of the potential driving motivation model are optimized by minimizing this loss function.
6. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 5, characterized in that, in the three-layer neural network architecture of the generative network and the inference network, the first layer is a fully connected layer, the second layer is a long short-term memory (LSTM) layer for capturing temporal dependence in sequence data, and the third layer consists of two fully connected structures that respectively output the multivariate mean and the variance.
7. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 1, further comprising a potential reward signal, specifically: first, an actual policy and a counterfactual policy are defined, wherein the actual policy includes the potential motivation and the counterfactual policy does not; then, the counterfactual policy is approximated using a Monte Carlo method; next, the KL divergence between the actual policy and the counterfactual policy is calculated and defined as the potential reward, and the result is min-max normalized, wherein the extrema used for the min-max normalization are determined from the potential reward statistics of the training process, the minimum potential reward over all samples in the training process being taken as the lower limit and the maximum as the upper limit, forming a statistical interval adapted to the cooperative adaptive cruise control scenario; and finally, a mixed reward signal is constructed and used to update the critic network and actor network parameters.
8. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 7, characterized in that the Monte Carlo method approximates the counterfactual policy by drawing at least two independent and identically distributed samples from the posterior distribution and taking their average as the approximation.
9. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 1, further comprising a training process for the multi-agent actor-critic algorithm, wherein the algorithm is trained online, specifically: first, an online replay buffer D is established for storing transition data; then, when the amount of data stored in buffer D reaches a preset threshold, a mini-batch of samples is drawn from buffer D; and finally, the mini-batch is used to update the parameters of the critic network, the actor network, and the potential driving motivation model respectively, thereby realizing online training of the algorithm.
10. The vehicle cooperative adaptive cruise control method based on potential driving motivation according to claim 9, characterized in that the parameters are updated in the following order: the critic network parameters first, the actor network parameters second, and the potential driving motivation model parameters last; and the gradient clipping threshold used when updating the actor network and potential driving motivation model parameters is set to 1.0 to suppress gradient explosion and ensure training stability.
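The online training procedure of claims 9 and 10 can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the buffer samples uniformly (no prioritization is shown), clipping is done by global L2 norm (the claims give only the threshold 1.0, not the clipping rule), the update is plain gradient descent, and `ReplayBuffer`, `train_step`, `grad_fns`, and the placeholder gradients are hypothetical names introduced for illustration.

```python
import math
import random
from collections import deque

GRAD_CLIP_THRESHOLD = 1.0  # clipping threshold stated in claim 10


def clip_by_global_norm(grads, max_norm=GRAD_CLIP_THRESHOLD):
    """Rescale a gradient vector so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    return [g * max_norm / norm for g in grads]


class ReplayBuffer:
    """Online buffer D; uniform sampling here, prioritization omitted."""

    def __init__(self, capacity, min_size):
        self.data = deque(maxlen=capacity)
        self.min_size = min_size  # preset threshold before training starts

    def add(self, transition):
        self.data.append(transition)

    def ready(self):
        return len(self.data) >= self.min_size

    def sample(self, batch_size, rng):
        return rng.sample(list(self.data), batch_size)


def train_step(buffer, batch_size, rng, params, grad_fns, lr=0.01):
    """One online update in the order critic -> actor -> PM model."""
    if not buffer.ready():
        return False
    batch = buffer.sample(batch_size, rng)
    # Clipping applies to the actor and PM model updates, not the critic.
    for name, clip in (("critic", False), ("actor", True), ("pm", True)):
        grads = grad_fns[name](params[name], batch)
        if clip:
            grads = clip_by_global_norm(grads)
        params[name] = [p - lr * g for p, g in zip(params[name], grads)]
    return True


# Usage with placeholder gradients (hypothetical; real ones come from the losses).
rng = random.Random(0)
buffer = ReplayBuffer(capacity=1000, min_size=4)
params = {"critic": [0.0, 0.0], "actor": [0.0, 0.0], "pm": [0.0, 0.0]}
grad_fns = {name: (lambda p, batch: [30.0, 40.0]) for name in params}
for step in range(6):
    buffer.add({"obs": step, "action": 0, "reward": 0.0, "next_obs": step + 1})
    train_step(buffer, batch_size=2, rng=rng, params=params, grad_fns=grad_fns)
```

With the same placeholder gradient for all three components, the critic parameters move much further than the actor and PM parameters, because only the latter two updates are clipped to norm 1.0.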

Description

Vehicle cooperative adaptive cruise control method based on potential driving motivation
Technical Field
The invention relates to the technical field of intelligent transportation systems and autonomous vehicle control, and in particular to a vehicle cooperative adaptive cruise control method based on potential driving motivation.
Background
With the rapid development of intelligent transportation and autonomous driving technologies, cooperative adaptive cruise control (CACC) has attracted wide attention as a core technology for improving the driving safety and efficiency of vehicle fleets. CACC uses vehicle-to-vehicle (V2V) communication to exchange information among connected and automated vehicles (CAVs) and to coordinate the speeds and inter-vehicle distances of the fleet; its core aim is to solve the string stability problem of the fleet. Existing CACC approaches mostly adopt either a traditional control strategy or a basic multi-agent reinforcement learning (MARL) algorithm, and both have obvious shortcomings.
The traditional CACC control method establishes an optimization model with string stability constraints. It can keep the fleet basically stable under ideal operating conditions but lacks adaptability to complex, dynamic traffic environments: when facing disturbances such as sudden deceleration of the preceding vehicle or random fluctuations in driving behavior, its control response lags and it cannot adjust decisions flexibly.
MARL-based methods have become an important research direction for CACC because of their advantages in sequential decision problems, but existing basic MARL algorithms face key technical bottlenecks in multi-vehicle cooperative scenarios. First, information homogenization and over-generalization are prominent: most communication-based MARL algorithms (such as CommNet and ConseNet) have difficulty distinguishing the state differences of different vehicles during learning and cannot accurately capture the latent useful information in the CACC scenario, so cooperation among vehicles lacks specificity and the fleet responds with delay. Second, collision risk is high and convergence efficiency is poor: in typical CACC scenarios such as catch-up and deceleration, the collision rate of existing algorithms (such as FPrint) is as high as 45%, and convergence takes too long for the fleet to stabilize quickly. Third, robustness is insufficient: existing research has difficulty coping with sudden disturbances such as rapid deceleration of real vehicles and cannot effectively adapt to the uncertainty of human driving behavior. Fourth, energy consumption optimization is not integrated: most MARL methods under centralized frameworks focus on stability and safety, are prone to delay, and do not achieve optimal energy consumption. Existing research therefore does not fully meet the requirements of CACC systems.
Disclosure of Invention
The invention aims to provide a vehicle cooperative adaptive cruise control method based on potential driving motivation, so as to solve the problems in the prior art described above.
The following abbreviations are used in this application: MAACPM denotes the multi-agent actor-critic algorithm; CAV, connected and automated vehicle; PM, potential driving motivation; LSTM, long short-term memory; V2V, vehicle-to-vehicle; CACC, cooperative adaptive cruise control; OVM, optimal velocity model; Dec-POMDP, decentralized partially observable Markov decision process; MARL, multi-agent reinforcement learning.
To achieve the above object, the invention provides a vehicle cooperative adaptive cruise control method based on potential driving motivation, comprising the following steps: S1, first, vehicle dynamics modeling is carried out; then the MAACPM algorithm under a decentralized framework is constructed based on the actor-critic framework of MARL, and an independent actor network, critic network, and PM model are configured for each CAV, wherein the actor network and the critic network each have a three-layer architecture of fully connected layer + LSTM layer + fully connected layer, and the PM model is used to infer the latent useful information, or potential motivation, in the vehicle CACC scenario; S2, each CAV acquires state information of the preceding and following vehicles through V2V communication and takes it as its own observation space, inputs this observation into the PM model, and infers the potential driving motivation of each CAV, so that