CN-121545369-B - Dynamic traffic guidance and signal control cooperative method under traffic accident


Abstract

The invention belongs to the technical field of intelligent traffic and relates to a cooperative method for dynamic traffic guidance and signal control under traffic accidents. The method comprises: constructing a directed weighted complex road network model, and acquiring and evaluating the real-time running state of the road network; generating a path induction module using a double Q-learning algorithm, based on the complex road network model and its running state; and constructing a signal control agent based on the deep deterministic policy gradient (DDPG) algorithm to generate a signal control module. The invention solves the technical problem that existing traffic guidance and signal control are often managed and optimized as independent systems, lack an effective cooperative mechanism, and fail to form an integrated solution.

Inventors

HAO WEI
LI SHUXIN
ZHANG ZHAOLEI
Shen Zian
Yan Zhangcun

Assignees

Changsha University of Science and Technology (长沙理工大学)

Dates

Publication Date: 20260512
Application Date: 20260120

Claims (6)

1. A cooperative method for dynamic traffic guidance and signal control under traffic accidents, characterized by comprising the following steps:
S1, constructing a directed weighted complex road network model, and acquiring and evaluating the real-time running state of the road network;
S2, generating a path induction module using a double Q-learning algorithm, based on the complex road network model constructed in S1 and its running state;
S3, constructing a signal control agent based on a deep deterministic policy gradient (DDPG) algorithm, wherein the agent senses the traffic state of an intersection, outputs the green light duration of the next phase as its action, and learns from a reward signal centered on vehicle delay and queue length, so as to generate a signal control module;
S4, establishing a bidirectional cooperative mechanism between the path induction module and the signal control module, and executing cooperative iteration in a rolling time window to form a closed-loop cooperative optimization system of path induction and signal control;
wherein S4 specifically comprises:
S401, establishing a cooperative data interface between the path induction module and the signal control module, through which the path induction module transmits the generated detour path set, the periodic flow prediction value of each node and the flow direction proportions to the signal control module;
S402, dynamically updating the path induction scheme and the signal timing scheme at a preset rolling time window period, through continuous data interaction and strategy iteration between the path induction module and the signal control module, so as to realize closed-loop collaborative optimization of path selection and signal control;
wherein the continuous data interaction and iteration between the path induction module and the signal control module in S402 specifically comprises:
S4021, the path induction module generates an optimal detour path scheme based on traffic state data acquired in real time, and predicts the flow distribution on each detour path over the coming rolling time window;
S4022, the signal control module receives the path flow prediction output by the path induction module, optimizes the green split parameters of key intersections using the deep deterministic policy gradient algorithm, and generates a new signal timing scheme adapted to the flow change;
S4023, the signal control module transmits the optimized delay data and saturation data of each node back to the path induction module in real time;
S4024, the path induction module updates the path impedance value and the path selection probability of each road section according to the received delay and saturation feedback, and reallocates the detouring traffic flow;
S4025, S4021 to S4024 are repeated with the preset rolling time window as the iteration period, so that the path selection strategy and the signal timing strategy are coordinated in real time and the system adapts dynamically to traffic flow changes.
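The S4021 to S4025 loop of claim 1 can be pictured with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the two detour paths `A` and `B`, their free-flow times and capacities, an inverse-impedance flow split standing in for the path induction module, and a proportional green split with a BPR-style delay standing in for the DDPG signal control module. The sketch only illustrates which data crosses the interface in each rolling window, not the learning algorithms themselves.

```python
# Hypothetical two-path example; all numbers are illustrative assumptions.
FREE_FLOW = {"A": 60.0, "B": 90.0}       # free-flow travel time per path (s)
CAPACITY = {"A": 1800.0, "B": 1800.0}    # capacity at full green (veh/h)


def guidance_step(impedance, demand):
    """S4021: split detour demand across paths, inversely to impedance."""
    inverse = {p: 1.0 / z for p, z in impedance.items()}
    total = sum(inverse.values())
    return {p: demand * w / total for p, w in inverse.items()}


def signal_step(flows):
    """S4022-S4023: pick green splits, then report delay and saturation."""
    ratios = {p: flows[p] / CAPACITY[p] for p in flows}
    total = sum(ratios.values())
    splits = {p: r / total for p, r in ratios.items()}  # splits sum to 1
    saturation = {p: flows[p] / (CAPACITY[p] * splits[p]) for p in flows}
    delay = {p: FREE_FLOW[p] * (1.0 + 0.15 * saturation[p] ** 4) for p in flows}
    return splits, delay, saturation


def run_rolling_horizon(demand, windows):
    """S4025: repeat the exchange once per rolling time window."""
    impedance = dict(FREE_FLOW)
    history = []
    for _ in range(windows):
        flows = guidance_step(impedance, demand)          # S4021
        splits, delay, saturation = signal_step(flows)    # S4022, S4023
        impedance = delay                                 # S4024 feedback
        history.append((flows, splits))
    return history
```

Calling `run_rolling_horizon(1200.0, 5)` returns one `(flows, splits)` pair per window; in a real system each stand-in function would be replaced by the corresponding trained module.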
2. The cooperative method for dynamic traffic guidance and signal control under traffic accidents according to claim 1, wherein step S1 specifically comprises the following steps:
S101, constructing a directed weighted complex road network graph model from the original road network;
S102, acquiring the real-time running state of the road network by combining data collected by roadside sensing units with vehicle-reported data, and aggregating the running state at a control center;
S103, acquiring lane travel time, road usage and queue length information, calculating the travel time impedance value of each road section using the BPR function, and using the impedance values as road section weights to evaluate the running state of the road network.
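As an illustration of S101 and S103, the sketch below computes a BPR travel time for each road section and stores it as the edge weight of a directed graph. It assumes the conventional BPR form t = t0 * (1 + alpha * (v / c) ** beta) with the commonly used defaults alpha = 0.15 and beta = 4; the claim does not state its parameter values, and the function names and the `links` layout are hypothetical.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """BPR impedance: free-flow time inflated by the volume/capacity ratio."""
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


def build_weighted_network(links):
    """Build a directed weighted graph as nested dicts.

    links maps (from_node, to_node) -> (free_flow_time, volume, capacity).
    The result maps from_node -> {to_node: BPR impedance}.
    """
    graph = {}
    for (u, v), (t0, volume, capacity) in links.items():
        graph.setdefault(u, {})[v] = bpr_travel_time(t0, volume, capacity)
        graph.setdefault(v, {})  # make sure sink nodes also appear
    return graph
```

A section running at capacity with a 60 s free-flow time gets an impedance of about 69 s under these defaults, while an empty section keeps its free-flow time.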
3. The cooperative method for dynamic traffic guidance and signal control under traffic accidents according to claim 1, wherein step S2 specifically comprises the following steps:
S201, constructing a reinforcement learning model for path induction, the model comprising a first value function network and a second value function network arranged in parallel, and an experience replay pool;
S202, acquiring node state information and road section state information of the road network, fusing them, and constructing an environment state vector representing the real-time running state of the road network;
S203, inputting the environment state vector into the reinforcement learning model, which uses a probabilistic selection policy to select a next-hop node from all legal adjacent nodes of the current node, the next-hop node being taken as action a;
S204, executing action a, moving the recommended path from the current node to the next node, the environment then generating a composite reward value and a new environment state vector according to the new road network state after action a is executed;
S205, storing the experience tuple composed of the environment state vector, action a, the composite reward value and the new environment state vector in the experience replay pool, periodically sampling experience data from the experience replay pool, iteratively updating the parameters of the first value function network and the second value function network through the double Q-learning algorithm, optimizing the path decision logic, and generating the path induction module.
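The S203 to S205 cycle can be sketched in tabular form. This is a simplification under stated assumptions: two lookup tables stand in for the two value function networks, the selection policy is assumed to be epsilon-greedy (the policy name is not legible in the claim), the reward is a hypothetical negative travel time plus a goal bonus rather than the composite reward of claim 4, and every non-goal node is assumed to have at least one legal neighbour.

```python
import random
from collections import defaultdict


def select_next_hop(q_a, q_b, graph, node, epsilon, rng):
    """S203: pick a next hop among legal neighbours (assumed epsilon-greedy)."""
    actions = list(graph[node])
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_a[(node, a)] + q_b[(node, a)])


def double_q_update(q_a, q_b, graph, s, a, r, s_next, done, alpha, gamma, rng):
    """Double Q-learning: one table selects the best action, the other rates it."""
    q_sel, q_eval = (q_a, q_b) if rng.random() < 0.5 else (q_b, q_a)
    target = r
    next_actions = list(graph[s_next])
    if not done and next_actions:
        best = max(next_actions, key=lambda a2: q_sel[(s_next, a2)])
        target += gamma * q_eval[(s_next, best)]
    q_sel[(s, a)] += alpha * (target - q_sel[(s, a)])


def train(graph, source, goal, episodes=500, seed=0):
    """S204-S205: roll out paths, store tuples, replay them in mini-batches."""
    rng = random.Random(seed)
    q_a, q_b = defaultdict(float), defaultdict(float)
    replay = []  # experience replay pool of (s, a, r, s_next, done)
    for _ in range(episodes):
        node = source
        for _ in range(20):  # cap on path length per episode
            nxt = select_next_hop(q_a, q_b, graph, node, 0.2, rng)
            reward = -graph[node][nxt] + (100.0 if nxt == goal else 0.0)
            replay.append((node, nxt, reward, nxt, nxt == goal))
            node = nxt
            if node == goal:
                break
        for s, a, r, s2, done in rng.sample(replay, min(32, len(replay))):
            double_q_update(q_a, q_b, graph, s, a, r, s2, done, 0.1, 0.9, rng)
    return q_a, q_b
```

In this sketch the action is the next-hop node itself, so the next state equals the action; a network-based version would replace the two tables with parameterised value functions over the environment state vector.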
4. The cooperative method for dynamic traffic guidance and signal control under traffic accidents according to claim 3, wherein the composite reward value in S204 is composed of a travel time reward, a path length reward, a bandwidth utilization reward, a node impedance reward and a goal-directed reward, as shown in formula (1), in which each reward component has its own weight coefficient: a weight coefficient for the path length reward, a weight coefficient for the travel time reward, a weight coefficient for the bandwidth utilization reward, a weight coefficient for the node impedance reward, and a weight coefficient for the goal-directed reward.
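Formula (1) of claim 4 is not legible in this text. One form consistent with the five reward components and five weight coefficients the claim lists would be a weighted sum, written here with hypothetical symbols: R is the composite reward, R_L, R_T, R_B, R_N and R_G are the path length, travel time, bandwidth utilization, node impedance and goal-directed rewards, and each omega is the matching weight coefficient. This is an assumption, not the patent's stated formula.

```latex
R = \omega_{L} R_{L} + \omega_{T} R_{T} + \omega_{B} R_{B} + \omega_{N} R_{N} + \omega_{G} R_{G}
```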
5. The cooperative method for dynamic traffic guidance and signal control under traffic accidents according to claim 1, wherein step S3 specifically comprises the following steps:
S301, constructing a signal control agent based on the deep deterministic policy gradient (DDPG), the signal control agent comprising an actor network for outputting a control strategy and a critic network for evaluating the value of the strategy;
S302, constructing an Actor state and a Critic state, wherein the Actor state observes the vehicle density and vehicle queuing condition of the entry lanes of the intersection, and its state space is defined as shown in formula (2), which is expressed in terms of the set of real numbers, the set of entry lanes of the intersection, the number of signal phases at the intersection, and the binary set;
S303, selecting an action according to the Actor state, the action being the duration of the next phase and satisfying the constraint that this duration is not less than the minimum green time and not more than the maximum green time;
S304, establishing a comprehensive reward and penalty function for the DDPG-based signal control algorithm, the function comprising a vehicle delay penalty and a vehicle queue length penalty;
S305, iteratively training the signal control agent based on the deep deterministic policy gradient (DDPG), and generating the signal control module by optimizing the parameters of the actor network and the critic network.
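The state, action constraint and reward of S302 to S304 can be illustrated with three small helpers. These are sketches under assumptions the claim does not confirm: the actor is assumed to emit a value in [-1, 1] (for example from a tanh output layer), the current phase is assumed to be encoded as a one-hot binary vector, and the weights `w_delay` and `w_queue` are hypothetical. The actor and critic networks themselves are omitted.

```python
def actor_state(densities, queues, current_phase, num_phases):
    """S302: real-valued lane observations plus a binary phase indicator."""
    one_hot = [1 if i == current_phase else 0 for i in range(num_phases)]
    return list(densities) + list(queues) + one_hot


def scale_action_to_green(raw_action, g_min, g_max):
    """S303: map an actor output in [-1, 1] to a green time in [g_min, g_max]."""
    clipped = max(-1.0, min(1.0, raw_action))
    return g_min + 0.5 * (clipped + 1.0) * (g_max - g_min)


def signal_reward(total_delay, queue_lengths, w_delay=1.0, w_queue=0.5):
    """S304: negative weighted sum of the delay and queue length penalties."""
    return -(w_delay * total_delay + w_queue * sum(queue_lengths))
```

Clipping before scaling guarantees that the minimum and maximum green time constraint holds even when exploration noise pushes the raw action outside [-1, 1].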
6. The cooperative method for dynamic traffic guidance and signal control under traffic accidents according to claim 3, wherein the node state information in S202 includes a node computation utilization rate, a storage utilization rate, a current node flag, a target node flag and an entry node flag, and the road section state information includes a road section utilization rate and a normalized road section delay.

Description

Dynamic traffic guidance and signal control cooperative method under traffic accident

Technical Field

The invention belongs to the technical field of intelligent traffic, and particularly relates to a cooperative method for dynamic traffic guidance and signal control under traffic accidents.

Background

With accelerating urbanization and the continuous growth in the number of motor vehicles, urban traffic congestion has become increasingly prominent. Traffic accidents are emergency events in a traffic system: they not only interrupt local traffic flow but can also trigger large-scale congestion, seriously affecting the efficiency and safety of urban traffic. Traditional accident handling methods, such as manual traffic direction and fixed signal timing, often suffer from delayed response, untimely information transmission and inflexible control strategies, and struggle to cope with the complex traffic conditions that accidents cause.

In the prior art, traffic management for accidents mainly comprises traffic guidance and signal control. Traffic guidance typically provides drivers with road condition information and suggested detour paths through variable message signs (VMS), broadcasting, navigation systems and the like, in order to disperse traffic flow and relieve pressure on the accident area. However, existing traffic guidance systems often lack real-time, dynamic and intelligent capabilities, and find it difficult to provide an optimal guidance strategy that reflects the real-time influence range of an accident and the dynamic change of traffic flow. For example, some systems provide only static detour routes without considering how real-time road conditions evolve, which leads to poor guidance performance and may even shift congestion to other areas.
In terms of signal control, traditional signal control systems mostly adopt fixed timing or actuated control. Their control logic is relatively simple, and they cannot adjust quickly and adaptively to the severe fluctuations in traffic flow that emergencies such as traffic accidents produce. When an accident happens, the traffic demand and capacity of intersections around the accident area change markedly. If the signal timing at these intersections cannot be adjusted promptly and reasonably, large volumes of traffic accumulate there, which aggravates congestion and can even cause secondary accidents.

In addition, traffic guidance and signal control are often managed and optimized as independent systems. An effective cooperative mechanism between them is lacking, no integrated solution is formed, and the overall improvement in traffic dispersal is limited. How to realize dynamic, real-time and intelligent collaborative optimization of traffic guidance and signal control under traffic accidents is therefore a key problem to be solved in the current traffic management field.

Disclosure of Invention

The invention aims to provide a cooperative method for dynamic traffic guidance and signal control under traffic accidents, which addresses the lack of cooperation between path guidance and signal control, the delayed guidance response and the poor adaptability of control strategies in existing methods, thereby improving road traffic efficiency and network operation stability in accident scenarios.
In order to solve the above technical problems, the technical scheme adopted by the invention is a cooperative method for dynamic traffic guidance and signal control under traffic accidents, comprising the following steps:
S1, constructing a directed weighted complex road network model, and acquiring and evaluating the real-time running state of the road network;
S2, generating a path induction module using a double Q-learning algorithm, based on the complex road network model constructed in S1 and its running state;
S3, constructing a signal control agent based on a deep deterministic policy gradient (DDPG) algorithm, wherein the agent senses the traffic state of an intersection, outputs the green light duration of the next phase as its action, and learns from a reward signal centered on vehicle delay and queue length, so as to generate a signal control module;
S4, establishing a bidirectional cooperative mechanism between the path induction module and the signal control module, and executing cooperative iteration in a rolling time window to form a closed-loop cooperative optimization system of path induction and signal control.
Further, the specific steps of S1 are as follows: S101, constructing di