
CN-121977602-A - Vehicle path planning method based on road network division and graph reinforcement learning


Abstract

The invention relates to the field of path planning, in particular to a vehicle path planning method based on road network division and graph reinforcement learning. The method comprises the steps of: forming a directed graph from the traffic road network; dividing the nodes of the graph into a plurality of control areas, with the nodes adjacent between areas designated as control nodes; assigning a weight to each edge according to real-time traffic flow data; inputting the directed graph into a graph convolutional network to extract a road-condition flow feature matrix; combining the flow feature matrix, the current position matrix and the destination position matrix to obtain the current state of the agent; the agent then selecting a direction of travel with its decision network in the current state, updating the state according to the selected action, and optimizing the driving route. The invention achieves efficient path planning in complex, dynamically changing traffic environments and is suitable for optimizing a vehicle's driving route in real time.
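The agent state described in the abstract (flow features plus one-hot position and destination matrices) can be sketched as follows. This is an illustrative reconstruction, not the patented implementation: the node ids, flow values and the greedy stand-in for the decision network are all invented for illustration.

```python
# Illustrative sketch of the agent state assembly: per-node flow features,
# a one-hot current-position vector and a one-hot destination vector are
# concatenated into the state a decision network would consume.

def one_hot(n, i):
    """Length-n one-hot vector with a 1.0 at position i."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def build_state(flow_features, current, destination):
    """Concatenate flow features with position and destination encodings."""
    n = len(flow_features)
    return flow_features + one_hot(n, current) + one_hot(n, destination)

def greedy_step(adj, flow_features, current, destination):
    """Stand-in for the decision network: move to the least-congested
    neighbouring intersection. A trained agent would use learned values."""
    return min(adj[current], key=lambda v: flow_features[v])

# toy 4-intersection network with synthetic congestion per intersection
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
flow = [0.2, 0.9, 0.3, 0.1]
state = build_state(flow, current=0, destination=3)
print(len(state))                     # 3 vectors of length 4 -> 12
print(greedy_step(adj, flow, 0, 3))   # picks intersection 2 over congested 1
```

In the patented method the flow features would come from the graph convolutional network rather than raw per-node values, and the action would be chosen by the trained decision network.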

Inventors

  • LI QIAN
  • WANG AO
  • LU XINGYU
  • LIU YANBING

Assignees

  • Chongqing University of Posts and Telecommunications (重庆邮电大学)

Dates

Publication Date
2026-05-05
Application Date
2026-03-19

Claims (9)

  1. A vehicle path planning method based on road network division and graph reinforcement learning, characterized by comprising the following steps: forming a directed graph from the traffic road network, wherein nodes in the graph represent traffic intersections and edges represent road connections between the nodes; normalizing the edge weights using static and dynamic indexes between nodes, and then dividing the traffic network into a plurality of control areas; the agent acquiring the directed graph of the current control area, inputting the directed graph into a graph convolutional network to extract a road-condition flow feature matrix, and taking the flow feature matrix, the current position matrix and the destination position matrix as the current state of the agent; the agent selecting a direction of travel with its decision network in the current state, updating the state according to the selected action, and optimizing the driving route.
  2. The vehicle path planning method based on road network division and graph reinforcement learning according to claim 1, wherein the real-time traffic flow data includes vehicle speed, vehicle queue length and road length, and the road network region R is divided by adaptive network division into a plurality of road network areas, each comprising n adjacent intersection nodes.
  3. The vehicle path planning method based on road network division and graph reinforcement learning according to claim 2, wherein the control of the vehicle agents of each region R_k is expressed through the set of vehicle actions of the k-th area R_k at time step t, wherein each element of the set is the action strategy adopted by the agent at a single intersection in the control area at time step t, the action strategies comprising going straight, turning left, turning right and making a U-turn, and i ∈ R denotes that the i-th intersection belongs to the control area R.
  4. A vehicle path planning method based on road network division and graph reinforcement learning according to claim 1 or 2, characterized in that the process of dividing the road network region R into individual areas by adaptive network division comprises: cutting the graph structure with minimum cutting cost as the objective; judging whether the number of nodes in each subgraph obtained after cutting is larger than a set threshold, and if so, continuing to cut that subgraph; and when the number of nodes in every subgraph is smaller than or equal to the set threshold, outputting each subgraph as a cut region.
  5. The vehicle path planning method based on road network division and graph reinforcement learning according to claim 4, wherein the cutting cost of cutting the graph G into subgraphs G1 and G2 is expressed in terms of: the similarity between the regions of subgraphs G1 and G2; the internal association degree of subgraph G1; the internal association degree of subgraph G2; the association degree of subgraph G1 with graph G; and the association degree of subgraph G2 with graph G.
  6. The vehicle path planning method based on road network division and graph reinforcement learning of claim 5, wherein the degree of association between any two graphs is expressed through the degree of association between a node in one graph and a node in the other, which is determined by the static index between the two nodes, the absolute difference between the dynamic indexes of the two nodes, and a correction coefficient of the dynamic index.
  7. A vehicle path planning method based on road network partitioning and graph reinforcement learning as claimed in claim 1 or 6, wherein the static index between two nodes is expressed in terms of: the road length between the two nodes; an indicator parameter of whether the two nodes are directly connected, equal to 1 if they are directly connected and 0 otherwise; and a correction coefficient of the static index.
  8. A vehicle path planning method based on road network partitioning and graph reinforcement learning as claimed in claim 1 or 6, wherein the dynamic index of a node is expressed in terms of: the congestion condition of the road between the node and a neighbouring node based on traffic flow, defined as the ratio of the traffic flow of that road to its number of lanes; and the congestion condition of the same road based on average vehicle speed, which depends on the average vehicle speed of the road, the average-speed threshold at which road traffic transitions from a clear state to a slightly congested state, and the average-speed threshold at which road traffic transitions from medium congestion to heavy congestion.
  9. The vehicle path planning method based on road network division and graph reinforcement learning according to claim 1, wherein the reward value obtained after the agent performs an action is expressed in terms of: the influence of the traffic intersection flow at the current moment on the running speed of the vehicle; the average traffic speed of the current road; the running speed of the current vehicle; the straight-line distance from the current location to the destination; the straight-line distance from the previous location to the destination; the dynamic congestion index of each node i in the area where the current vehicle is located, with n the number of nodes in that area; and three influence coefficients.
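The threshold-driven recursive partitioning of claim 4 can be sketched as follows. This is a simplified illustration, not the patented procedure: the cut itself is reduced to a balanced split along a BFS ordering, whereas the claimed method would choose the cut minimizing the association-based cutting cost of claims 5 and 6; the toy network and threshold are invented for illustration.

```python
# Sketch of adaptive road-network partitioning: cut the graph repeatedly
# until every subgraph (control area) has at most `max_nodes` intersections.

from collections import deque

def bfs_order(adj, start):
    """Visit nodes reachable from `start` in breadth-first order."""
    seen, order, q = {start}, [], deque([start])
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                q.append(v)
    return order

def partition(adj, nodes, max_nodes):
    """Recursively split `nodes` until each region has <= max_nodes nodes."""
    if len(nodes) <= max_nodes:
        return [sorted(nodes)]
    # induced adjacency restricted to this node set
    sub = {u: [v for v in adj.get(u, ()) if v in nodes] for u in nodes}
    order = bfs_order(sub, min(nodes))
    # append any nodes unreachable inside the induced subgraph
    order += [u for u in sorted(nodes) if u not in order]
    half = len(order) // 2  # placeholder for the minimum-cut-cost split
    left, right = set(order[:half]), set(order[half:])
    return partition(adj, left, max_nodes) + partition(adj, right, max_nodes)

# toy 6-intersection corridor network
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
regions = partition(adj, set(adj), max_nodes=3)
print(regions)  # [[0, 1, 2], [3, 4, 5]]
```

The stopping rule mirrors the claim: a subgraph is cut again only while its node count exceeds the set threshold, and the surviving subgraphs are output as the control areas.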

Description

Vehicle path planning method based on road network division and graph reinforcement learning

Technical Field

The invention relates to the field of path planning, in particular to a vehicle path planning method based on road network division and graph reinforcement learning.

Background

In recent years, urban traffic congestion has become a serious problem due to rapid urbanization. It has a significant impact on urban traffic networks, resulting in additional travel time, fuel consumption and increased air pollution. Traffic congestion can be classified into recurrent congestion (RC) and non-recurrent congestion (NRC). NRC is defined as congestion caused by unexpected events, such as construction works, bad weather, accidents and special events. Owing to its unpredictability, NRC accounts for a greater proportion of traffic delays in urban areas than RC. There are three categories of methods to address the NRC problem: (1) detecting and predicting traffic congestion using both historical and real-time sensor data, (2) optimizing traffic signal control and management, and (3) optimizing vehicle routing and navigation. Among them, vehicle routing and navigation has been studied for decades as the most promising solution. The classical vehicle routing problem (VRP) is defined as finding a minimum-cost combination of routes for a given number of vehicles m to serve n customers; a typical example is a delivery company planning routes to collect and deliver packages. Conventional VRPs are formulated as NP-hard problems, and many optimization algorithms have been proposed to find sub-optimal solutions under different constraints, such as genetic, firefly, hybrid or backbone algorithms.
However, this type of routing problem definition assumes that traffic conditions are relatively stable, so the proposed optimization algorithms search for optimal solutions in a static environment. Unlike classical VRPs, the present invention addresses the path and navigation problem of a controlled vehicle travelling from a given starting node to a target node, seeking the shortest travel time under dynamic traffic conditions. The goal is to find a path, or a set of optimal actions, that minimizes travel time while collecting current traffic flow input in real time. Over the past few years, many heuristic and evolutionary optimization algorithms have been proposed to address this problem. Although recent studies of vehicle routing and navigation have achieved reasonable results, the most advanced approaches have several drawbacks. First, many shortest-path-type approaches fail to react immediately to NRC, since traffic conditions worsen unpredictably and the computed solution ceases to be optimal. Second, optimization algorithms are difficult to run in real time. Third, optimization-based vehicle path navigation algorithms cannot self-evolve or self-adapt. Therefore, for a vehicle environment that is dynamically updated in real time, such methods can only complete static optimization rather than dynamic optimization. Achieving real-time, dynamically optimal vehicle navigation would alleviate urban traffic network congestion and provide more strategies for reducing urban congestion. With the development of artificial intelligence and machine learning technology, the application of graph reinforcement learning in intelligent traffic systems is becoming a research hotspot.
Much research has focused on how to utilize graph structure models for path planning and optimization, especially in dynamic and complex environments. Levine et al. (2016) further explored the use of deep reinforcement learning in dynamic path planning and proposed a framework for optimizing navigation strategies through repeated training and environmental interaction. These studies provide theoretical support for the application of graph reinforcement learning in the traffic field, and some progress has been made in practice. Despite this progress at home and abroad, several challenges remain: for example, how to guarantee the stability of path planning in highly dynamic and uncertain environments, and how to effectively combine multi-source data (such as traffic flow, weather and accidents) to make real-time decisions. These problems remain hot spots and difficulties in current research. With the development of graph reinforcement learning technology, breakthroughs on these problems are expected, further improving the intelligence and real-time response capability of navigation systems. In summary, the current study on