CN-121982892-A - Intersection vehicle cooperative control method based on low-orbit satellite and reinforcement learning
Abstract
The invention discloses an intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning, relating to the technical field of intelligent traffic and vehicle cooperative control. The method comprises the following steps: each vehicle uploads its state information to a ground cooperative control center through a satellite communication terminal; a data fusion module of the control center performs space-time alignment on the vehicle state information and constructs an intersection traffic state vector; the state vector is input into a multi-agent reinforcement learning model that takes traffic efficiency, safety and energy consumption as optimization targets, introduces hard and soft traffic-regulation constraints, and outputs a cooperative control strategy and predicted driving trajectories; the cooperative control strategy is issued to the vehicles through the satellite network; and a vehicle-mounted display system displays the suggested path to the driver.
Inventors
- HU XINGYU
- YIN CHENGLIANG
- WU YUEPENG
- GAO LIANGQUAN
- QIN WENGANG
Assignees
- 上海智能网联汽车技术中心有限公司
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-02-09
Claims (10)
- 1. An intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning, characterized by comprising the following steps: information acquisition and uploading, in which a vehicle establishes a communication link with a low-orbit satellite constellation through a vehicle-mounted satellite communication terminal and uploads vehicle state information to a ground cooperative control center in real time through the low-orbit satellite constellation, wherein the vehicle state information comprises vehicle position coordinates, driving speed, driving direction and steering intention; data fusion and state construction, in which a data fusion module of the ground cooperative control center receives the vehicle state information uploaded by all vehicles in an intersection area, performs space-time alignment processing on the vehicle state information, and constructs an intersection traffic state vector; multi-agent reinforcement learning decision, in which the intersection traffic state vector is input into a pre-trained multi-agent reinforcement learning model, wherein the model takes each vehicle in the intersection area as an independent agent, takes the overall traffic efficiency of the intersection, driving safety and vehicle energy consumption as comprehensive optimization targets, and outputs a cooperative control strategy and a predicted driving trajectory for each vehicle; and cooperative signal issuing and displaying, in which the ground cooperative control center transmits the cooperative control strategy and the predicted driving trajectories to the corresponding vehicles through the low-orbit satellite constellation, and each vehicle displays its own recommended driving path and the predicted driving trajectories of other vehicles to the driver through a vehicle-mounted display system.
- 2. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, characterized in that, in the information acquisition and uploading, the vehicle position coordinates are acquired through a vehicle-mounted high-precision positioning module, and the high-precision positioning module uses an integrated navigation mode combining a global navigation satellite system and an inertial measurement unit, so that the positioning accuracy reaches centimeter level.
- 3. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein in the data fusion and state construction, the space-time alignment processing includes: performing timestamp calibration on the vehicle state information uploaded by each vehicle to eliminate the time deviation caused by transmission delay; and uniformly converting the position coordinates of each vehicle into a local coordinate system whose origin is the center of the intersection.
- 4. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein in the data fusion and state construction, the intersection traffic state vector comprises a position coordinate component, a velocity component, an acceleration component, a heading angle component and a steering intent code component for each vehicle.
- 5. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein in the multi-agent reinforcement learning decision, the multi-agent reinforcement learning model adopts a centralized-training, distributed-execution framework, comprising: an offline training stage, in which the model is trained in a simulation environment using historical traffic data and learns an optimal cooperative strategy; and an online execution stage, in which a cooperative control strategy is output according to the real-time intersection traffic state vector during actual operation; the cooperative control strategy comprises a sequence of recommended driving speed, recommended acceleration and deceleration, recommended steering angle and traffic priority.
- 6. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein in the multi-agent reinforcement learning decision, the multi-agent reinforcement learning model introduces traffic regulation constraints including hard constraints and soft constraints; a hard constraint is a mandatory traffic rule that the vehicle must obey, and actions violating a hard constraint are directly forbidden during decision-making; a soft constraint is a recommended traffic rule that the vehicle should obey, and actions violating a soft constraint incur a penalty term in the reward function.
- 7. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein the multi-agent reinforcement learning decision further comprises conflict detection and resolution: calculating, from the predicted driving trajectory of each vehicle, the minimum space-time distance between any two vehicles within a preset future time period; when the minimum space-time distance is smaller than a preset safety threshold, determining that a potential traffic conflict exists; and for the vehicles with a potential traffic conflict, regenerating an adjusted cooperative control strategy with the multi-agent reinforcement learning model, so that the minimum space-time distance between the predicted driving trajectories of all adjusted vehicles is larger than the preset safety threshold.
- 8. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein in the cooperative signal issuing and displaying, the vehicle-mounted display system uses a graphical interface to display the following content: the current position and the suggested driving path of the vehicle, shown as a top view of the intersection; other vehicles and their predicted driving trajectories, marked in different colors; and potential conflict regions, highlighted.
- 9. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein when the signal quality of the low-orbit satellite constellation is lower than a preset signal threshold, the vehicle switches to a 5G Internet-of-Vehicles communication module to communicate with the ground cooperative control center.
- 10. The intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning according to claim 1, wherein in the multi-agent reinforcement learning decision, the optimization function of the comprehensive optimization objective is expressed as: $R = w_1 R_{\text{eff}} + w_2 R_{\text{safe}} + w_3 R_{\text{energy}}$; wherein $R$ denotes the composite reward value, $R_{\text{eff}}$ denotes the traffic efficiency reward component, $R_{\text{safe}}$ denotes the safety reward component, $R_{\text{energy}}$ denotes the energy consumption reward component, $w_1$ denotes the traffic efficiency weight coefficient and takes the value 0.4, $w_2$ denotes the safety weight coefficient and takes the value 0.4, and $w_3$ denotes the energy consumption weight coefficient and takes the value 0.2.
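The decision-stage elements named in claims 6, 7 and 10 can be illustrated with a minimal Python sketch. It rests on several assumptions that the claims do not fix: the composite reward is taken to be a weighted sum of the three components using the stated weights 0.4, 0.4 and 0.2; the "minimum space-time distance" is taken to be the smallest Euclidean distance between two predicted trajectories sampled at the same time steps; and the names `composite_reward`, `allowed_actions`, `min_separation`, `find_conflicts` and the `soft_penalty` value are hypothetical, chosen only for illustration.

```python
import math

# Weights as stated in claim 10: efficiency 0.4, safety 0.4, energy 0.2.
W_EFF, W_SAFE, W_ENERGY = 0.4, 0.4, 0.2


def composite_reward(r_eff, r_safe, r_energy, soft_violations=0, soft_penalty=0.1):
    """Weighted sum of the reward components minus soft-constraint penalties.

    The weighted-sum form and the value of `soft_penalty` are assumptions.
    """
    base = W_EFF * r_eff + W_SAFE * r_safe + W_ENERGY * r_energy
    return base - soft_penalty * soft_violations


def allowed_actions(actions, violates_hard):
    """Hard constraints: remove forbidden actions before the policy chooses."""
    return [a for a in actions if not violates_hard(a)]


def min_separation(traj_a, traj_b):
    """Smallest distance between two predicted trajectories.

    Each trajectory is a list of (x, y) points sampled at the same time steps,
    so an index-by-index comparison compares positions at the same instant.
    """
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


def find_conflicts(trajectories, safety_threshold):
    """Return the vehicle-id pairs whose minimum separation is below the threshold."""
    ids = sorted(trajectories)
    return [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if min_separation(trajectories[a], trajectories[b]) < safety_threshold
    ]
```

In this sketch, the pairs returned by `find_conflicts` would be the vehicles for which the model regenerates an adjusted strategy; how that regeneration is performed is not specified here.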
Description
Intersection vehicle cooperative control method based on low-orbit satellite and reinforcement learning
Technical Field
The invention relates to the technical field of intelligent traffic and vehicle cooperative control, in particular to an intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning.
Background
At present, vehicle passage and cooperative control at intersections rely mainly on the following technologies. Traffic signal control, the most common form of intersection traffic management, directs vehicles through traffic light signals with preset timing. Internet-of-Vehicles technology based on roadside units lets vehicles interact with roadside communication units to obtain traffic state information. Direct vehicle-to-vehicle communication exchanges basic state information between vehicles through dedicated short-range communication. In addition, autonomous driving schemes based on local sensors rely on on-board cameras, lidar and millimeter-wave radar to perceive the intersection environment, with the on-board computing unit completing decision-making and planning independently.
However, the prior art has systematic shortcomings when dealing with complex, high-traffic intersection encounter scenarios. First, traffic signal control is rigid: fixed-time signals cannot adapt to traffic flow that changes in real time, so traffic efficiency is low. Second, cooperative technologies based on vehicle-to-infrastructure and vehicle-to-vehicle communication are limited by communication coverage and reliability, and large-scale global optimization is difficult to achieve in areas without network coverage. Third, existing decision-making is local and prone to conflict: decisions based on vehicle-to-vehicle communication lack a global view, and each vehicle plans with its own optimum as the target, which easily produces conflicting decisions. Moreover, the perception capability of a single vehicle's sensors is limited by line of sight, weather and occlusion, so the accurate state and intention of vehicles on the other side of the intersection cannot be known in advance. Finally, existing methods cannot achieve system-level energy-efficiency optimization, and frequent stops and starts and rapid acceleration and deceleration cause unnecessary energy consumption.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning. A vehicle cooperative control system integrating low-orbit satellite communication and multi-agent reinforcement learning is constructed, so that globally optimal cooperative control of vehicles at the intersection is achieved.
In order to achieve the above purpose, the invention adopts the following technical scheme: an intersection vehicle cooperative control method based on low-orbit satellites and reinforcement learning, comprising the following steps.
Information acquisition and uploading: a vehicle establishes a communication link with a low-orbit satellite constellation through a vehicle-mounted satellite communication terminal and uploads vehicle state information to a ground cooperative control center in real time through the low-orbit satellite constellation, wherein the vehicle state information comprises vehicle position coordinates, driving speed, driving direction and steering intention.
Data fusion and state construction: a data fusion module of the ground cooperative control center receives the vehicle state information uploaded by all vehicles in the intersection area, performs space-time alignment processing on the vehicle state information, and constructs an intersection traffic state vector.
Multi-agent reinforcement learning decision: the intersection traffic state vector is input into a pre-trained multi-agent reinforcement learning model, which takes each vehicle in the intersection area as an independent agent, takes the overall traffic efficiency of the intersection, driving safety and vehicle energy consumption as comprehensive optimization targets, and outputs a cooperative control strategy and a predicted driving trajectory for each vehicle.
Cooperative signal issuing and displaying: the ground cooperative control center transmits the cooperative control strategy and the predicted driving trajectories to the corresponding vehicles through the low-orbit satellite constellation, and each vehicle displays its own recommended driving path and the predicted driving trajectories of other vehicles to the driver through the vehicle-mounted display system.
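The data fusion and state construction step can be illustrated with a minimal Python sketch. It is not the patented implementation and makes several assumptions: positions arrive in a shared planar map frame in metres rather than as latitude and longitude; the timestamp calibration is approximated by extrapolating each report to a common reference time under constant speed and heading; and the `VehicleReport` type, its field names, and the functions `align` and `build_state_vector` are hypothetical. The acceleration component that the state vector may also carry is omitted for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class VehicleReport:
    vehicle_id: str
    timestamp: float  # seconds, time the state was measured on the vehicle
    x: float          # metres, shared map frame (assumed planar)
    y: float          # metres, shared map frame (assumed planar)
    speed: float      # metres per second
    heading: float    # radians, 0 along +x, counter-clockwise positive
    intent: int       # steering intent code, e.g. 0 straight, 1 left, 2 right


def align(report, reference_time, center):
    """Bring one report to a common time and into the intersection frame.

    Time alignment: extrapolate the position over the delay, assuming
    constant speed and heading. Space alignment: subtract the coordinates
    of the intersection center so that it becomes the origin.
    """
    dt = reference_time - report.timestamp
    x = report.x + report.speed * math.cos(report.heading) * dt - center[0]
    y = report.y + report.speed * math.sin(report.heading) * dt - center[1]
    return [x, y, report.speed, report.heading, float(report.intent)]


def build_state_vector(reports, reference_time, center):
    """Concatenate aligned per-vehicle features in a stable, id-sorted order."""
    state = []
    for report in sorted(reports, key=lambda r: r.vehicle_id):
        state.extend(align(report, reference_time, center))
    return state
```

Sorting by vehicle identifier keeps the layout of the state vector stable between updates, which is one simple way to give a learned model a consistent input ordering; other orderings are equally possible.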
Further, in the information acquisition and uploading, the vehicle position coordinates are acquired through a vehicle-mo