CN-121982894-A - Traffic signal dynamic intelligent control system, method, storage medium and electronic equipment


Abstract

The invention discloses a traffic signal dynamic intelligent control system, a method, a storage medium and electronic equipment, belonging to the technical field of intelligent traffic control. The system adopts a cloud-edge-end three-layer cooperative architecture in which an edge server serves as a regional decision center and runs a reinforcement learning agent. The agent forms a state vector from multi-modal perception data, is trained and makes decisions using a multi-objective reward function that fuses efficiency, safety, fairness, environmental protection and emergency targets, and outputs phase and duration control instructions for the signal lamp, so as to realize dynamic multi-objective cooperative optimization of regional traffic flow. The invention addresses the rigid control and single optimization target of traditional signal lamps, and can remarkably improve the traffic efficiency, safety and environmental performance of intersections.

Inventors

CUI WEIQIN

Assignees

崔卫勤

Dates

Publication Date: 20260505
Application Date: 20260212

Claims (10)

1. A traffic signal dynamic intelligent control system, comprising a perception execution layer, an edge decision layer and a cloud optimization layer, wherein: the perception execution layer comprises a multi-source sensing unit and an execution unit, and is used for monitoring intersection data and controlling a signal lamp to execute actions, wherein the multi-source sensing unit at least comprises a high-definition camera, a radar, a geomagnetic coil, a sensor group and an acoustic array; the edge decision layer is in communication connection with the perception execution layer and comprises an edge computing server, wherein a reinforcement learning agent serves as the core decision engine of the edge computing server and is used for executing dynamic intelligent control of traffic signals; when performing the dynamic intelligent control of the traffic signals, the agent generates a control strategy in real time according to the features of different dimensions in the state feature vector corresponding to the monitored intersection data and sends the control strategy to the perception execution layer, the control strategy at least comprising the phase and the duration of an action; and the cloud optimization layer is in communication connection with the edge decision layer and comprises a cloud server, wherein the cloud server is used for issuing a trained policy network to the edge computing server and for acquiring the state feature vectors and experience data uploaded by the edge computing server so as to continuously optimize the policy network.
2. The traffic signal dynamic intelligent control system according to claim 1, wherein the state feature vector corresponding to the intersection data comprises at least flow and queuing features, waiting and delay features, pedestrian and non-motor vehicle features, time and environment features, special event features and regional cooperative features, and wherein the state feature vector is generated by integrating and normalizing the sensing data of the multi-source sensing unit, including: determining the flow and queuing features based on the high-definition camera, the radar and the geomagnetic coil; determining the waiting and delay features based on the high-definition camera and the geomagnetic coil; determining the pedestrian and non-motor vehicle features based on the high-definition camera and the sensor group; determining the time and environment features based on the sensor group; determining the special event features based on the acoustic array; and determining the regional cooperative features based on edge nodes or vehicle-road cooperative units.
3. A traffic signal dynamic intelligent control method, characterized by being applied to the cloud optimization layer of claim 1, wherein the cloud optimization layer performs the following steps when optimizing the policy network: calculating based on the state feature vector to obtain corresponding sub-functions; adaptively and dynamically updating the sub-function weights based on the state feature vector; and performing a weighted calculation of the reward function based on the sub-functions and the correspondingly matched sub-function weights, and optimizing the policy network by using the reward function.
4. The traffic signal dynamic intelligent control method according to claim 3, wherein calculating based on the state feature vector to obtain the corresponding sub-functions specifically comprises: the sub-functions comprise an efficiency sub-function, a safety sub-function, a fairness sub-function, an environmental protection sub-function and an emergency sub-function, wherein: the efficiency sub-function is related to the flow and queuing features and the waiting and delay features, and is calculated as: R_efficiency = - [ α·ΔT_total + β·L_queue ]; wherein R_efficiency is the efficiency sub-function, ΔT_total is the increment of the total waiting time of all controlled vehicles within one action execution period, L_queue is the average queue length of each lane at the end of the period, and α and β are positive normalization coefficients for balancing the time and space dimensions; the safety sub-function is related to the special event features and the time and environment features, and is calculated as: R_safety = - [ γ·N_conflict + λ·ΣI_hardbrake ]; wherein R_safety is the safety sub-function, N_conflict is the number of potential conflicts, predicted from sensor data, whose time to conflict is smaller than a safety threshold, and I_hardbrake is an indicator function of an emergency braking event, equal to 1 when such an event occurs and 0 otherwise; the fairness sub-function is related to the waiting and delay features, and is calculated as: R_fairness = - [ δ·T_max^2 ]; wherein R_fairness is the fairness sub-function, and T_max is the maximum waiting time of the vehicles over all directions at the end of the period; the environmental protection sub-function is related to the waiting and delay features and the time and environment features, and is calculated as: R_environment = - [ η·ΣT_idle ]; wherein R_environment is the environmental protection sub-function, and ΣT_idle is the total idling time of all vehicles waiting within the period; the emergency sub-function is related to the special event features, and is calculated as: R_emergency = + [ θ·I_clear_path ]; wherein R_emergency is the emergency sub-function, I_clear_path is a Boolean function equal to 1 if an unobstructed transit path is successfully planned and executed for the identified emergency vehicle and 0 otherwise, and θ is a positive reward coefficient for ensuring the absolute priority of this target.
5. The traffic signal dynamic intelligent control method according to claim 3, wherein adaptively and dynamically updating the sub-function weights based on the state feature vector specifically comprises: detecting corresponding trigger conditions based on the state feature vector, wherein the trigger conditions at least comprise an emergency priority trigger, a safety risk trigger, a congestion relief trigger, a fairness trigger and a regular policy trigger, and the sub-function weights at least comprise an efficiency weight, a safety weight, a fairness weight, an environmental protection weight and an emergency weight, wherein: when an emergency vehicle flag is determined to be present based on the special event features, the emergency weight is adjusted to a matching value, and the efficiency weight and the fairness weight are adjusted to temporary values; when, based on the special event features and the time and environment features, the conflict frequency is greater than a preset conflict value and/or the visibility is smaller than a preset visibility value, the safety weight and the efficiency weight are adjusted; when, based on the flow and queuing features and the waiting and delay features, the queue length of any lane is greater than a preset length value and/or the average vehicle speed remains below a preset vehicle speed for longer than a preset duration, the efficiency weight is adjusted; when, based on the waiting and delay features, the waiting time of the head vehicle in any direction is greater than a preset waiting time, the fairness weight is adjusted; and when a target period is determined to occur based on the regional cooperative features and the time and environment features, the efficiency weight, the safety weight and the environmental protection weight are adjusted to preset weights.
6. The traffic signal dynamic intelligent control method according to claim 3, wherein the weighted calculation of the reward function based on the sub-functions and the correspondingly matched sub-function weights specifically comprises: the sub-function weights comprise an efficiency weight, a safety weight, a fairness weight, an environmental protection weight and an emergency weight; the product of the value of the efficiency sub-function and the efficiency weight, the product of the value of the safety sub-function and the safety weight, the product of the value of the fairness sub-function and the fairness weight, the product of the value of the environmental protection sub-function and the environmental protection weight, and the product of the value of the emergency sub-function and the emergency weight are calculated in sequence and then accumulated to obtain the reward function, which is calculated as: R(s_t, a_t, s_{t+1}) = Σ_{i=1..5} w_i(t)·R_i; wherein R(s_t, a_t, s_{t+1}) is the reward function, s_t and s_{t+1} are the state vectors before and after the decision at time t, a_t is the action executed by the traffic light at time t, R_i is the value of the i-th sub-function, w_i is the dynamic weight corresponding to the i-th sub-function, and w_i(t) is the value of that dynamic weight at time t.
7. A traffic signal dynamic intelligent control method, characterized by being applied to the edge decision layer of claim 1, wherein the edge decision layer performs the following steps when dynamically and intelligently controlling traffic signals: acquiring a state feature vector corresponding to multi-source perception data, wherein the state feature vector at least comprises flow and queuing features, waiting and delay features and special event features; outputting, based on the state feature vector, a control strategy for the current traffic signal lamp by using the policy network issued by the cloud server, wherein the control strategy comprises a control phase and a control duration; and acquiring the control result of the current traffic signal lamp as experience data, and uploading the experience data together with the state feature vector to the cloud server.
8. The traffic signal dynamic intelligent control method according to claim 7, wherein outputting, based on the state feature vector, the control strategy for the current traffic signal lamp by using the policy network issued by the cloud server specifically comprises: inputting the state feature vector into the policy network, and calculating, through forward propagation, a probability distribution or value scores over all executable actions of the traffic signal lamp; and selecting, by the reinforcement learning agent, the executable action with the highest probability or the highest score as the control strategy, wherein the executable action comprises the phase and the duration of the action.
9. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the traffic signal dynamic intelligent control method according to any one of claims 3 to 6 and/or the traffic signal dynamic intelligent control method according to any one of claims 7 to 8.
10. An electronic device, characterized in that the electronic device comprises a processor and a memory, wherein the memory is used for storing a computer program, and the processor is used for executing the computer program stored in the memory, so that the electronic device executes the traffic signal dynamic intelligent control method according to any one of claims 3 to 6 and/or the traffic signal dynamic intelligent control method according to any one of claims 7 to 8.
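
As an illustration of the reward construction recited in claims 4 to 6, the following minimal Python sketch evaluates the five sub-functions and combines them into a weighted reward. It is only a sketch under stated assumptions: the coefficient values, the weight values, the `CycleStats` container and all function names are hypothetical and are not specified by the claims, and only the emergency priority trigger of claim 5 is shown.

```python
from dataclasses import dataclass


@dataclass
class CycleStats:
    """Hypothetical per-period measurements; field names are illustrative."""
    delta_t_total: float  # ΔT_total: increment of total waiting time (s)
    l_queue: float        # L_queue: average queue length per lane (vehicles)
    n_conflict: int       # N_conflict: predicted potential conflicts
    hard_brakes: int      # ΣI_hardbrake: number of emergency braking events
    t_max: float          # T_max: maximum waiting time over all directions (s)
    t_idle_total: float   # ΣT_idle: total idling time of waiting vehicles (s)
    clear_path: bool      # I_clear_path: emergency path planned and executed


# Assumed coefficient values; the claims only require positive coefficients.
COEF = {"alpha": 0.01, "beta": 0.1, "gamma": 1.0, "lam": 0.5,
        "delta": 0.001, "eta": 0.005, "theta": 10.0}


def sub_rewards(s: CycleStats, c: dict = COEF) -> dict:
    """Evaluate the five sub-functions using the formulas of claim 4."""
    return {
        "efficiency": -(c["alpha"] * s.delta_t_total + c["beta"] * s.l_queue),
        "safety": -(c["gamma"] * s.n_conflict + c["lam"] * s.hard_brakes),
        "fairness": -(c["delta"] * s.t_max ** 2),
        "environment": -(c["eta"] * s.t_idle_total),
        "emergency": c["theta"] * (1.0 if s.clear_path else 0.0),
    }


def adapt_weights(base: dict, emergency_vehicle: bool) -> dict:
    """Emergency priority trigger of claim 5, with assumed weight values."""
    weights = dict(base)  # copy so the baseline weights stay unchanged
    if emergency_vehicle:
        weights["emergency"] = 1.0    # assumed "matching value"
        weights["efficiency"] = 0.05  # assumed temporary value
        weights["fairness"] = 0.05    # assumed temporary value
    return weights


def total_reward(rewards: dict, weights: dict) -> float:
    """Weighted sum of the sub-function values, as described in claim 6."""
    return sum(weights[name] * value for name, value in rewards.items())
```

A caller would build one `CycleStats` per action execution period, pass the current weights through `adapt_weights`, and feed the result of `total_reward` to whatever training procedure optimizes the policy network; that procedure is outside the scope of this sketch.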

Description

Traffic signal dynamic intelligent control system, method, storage medium and electronic equipment

Technical Field

The invention belongs to the technical field of intelligent traffic control, and particularly relates to a traffic signal dynamic intelligent control system, a traffic signal dynamic intelligent control method, a storage medium and electronic equipment.

Background

Existing intelligent traffic signal control schemes are mostly based on fixed timing or simple adaptive rules. Although some of them have introduced reinforcement learning, the following basic defects remain:

1. Static optimization target: the preset optimization target weights are fixed, and priorities cannot be dynamically restructured when an emergency (such as an accident or an emergency vehicle) occurs or the traffic state changes suddenly.
2. Response lag: strategy adjustment depends on long-term model retraining, and real-time strategy switching on the scale of seconds or minutes cannot be realized.
3. Insufficient intelligence: the system lacks a "conditional reflex" capability based on real-time scenarios, and in essence remains automation under complex rules.

Disclosure of Invention

In view of the above drawbacks of the prior art, an object of the present invention is to provide a traffic signal dynamic intelligent control system, a method, a storage medium and an electronic device, which solve the problems of existing intelligent traffic signal control schemes that the optimization target is static, the policy response lags, and the degree of intelligence is insufficient.
In a first aspect, the present invention provides a traffic signal dynamic intelligent control system, the system comprising a perception execution layer, an edge decision layer and a cloud optimization layer, wherein: the perception execution layer comprises a multi-source sensing unit and an execution unit, and is used for monitoring intersection data and controlling a signal lamp to execute actions, wherein the multi-source sensing unit at least comprises a high-definition camera, a radar, a geomagnetic coil, a sensor group and an acoustic array; the edge decision layer is in communication connection with the perception execution layer and comprises an edge computing server, wherein a reinforcement learning agent serves as the core decision engine of the edge computing server and is used for executing dynamic intelligent control of traffic signals; when performing the dynamic intelligent control of the traffic signals, the agent generates a control strategy in real time according to the features of different dimensions in the state feature vector corresponding to the monitored intersection data and sends the control strategy to the perception execution layer, the control strategy at least comprising the phase and the duration of an action; and the cloud optimization layer is in communication connection with the edge decision layer and comprises a cloud server, wherein the cloud server is used for issuing a trained policy network to the edge computing server and for acquiring the state feature vectors and experience data uploaded by the edge computing server so as to continuously optimize the policy network.
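The data flow of the edge decision layer described above can be sketched in Python as follows. This is a toy illustration, not the patented implementation: the phase names, candidate durations, the linear scoring that stands in for the policy network, and the `EdgeAgent` class with its methods are all assumptions introduced for the example. The greedy choice of the highest-probability action follows the selection rule stated in claim 8.

```python
import math
from itertools import product

# Assumed discrete action space: every (phase, duration) combination.
PHASES = ["NS_through", "NS_left", "EW_through", "EW_left"]  # hypothetical names
DURATIONS = [10, 20, 30]                                     # assumed green times (s)
ACTIONS = list(product(PHASES, DURATIONS))


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


class EdgeAgent:
    """Toy stand-in for the reinforcement learning agent on the edge server."""

    def __init__(self, weights):
        # weights[k] is the weight row scoring ACTIONS[k]. In the described
        # system a trained policy network issued by the cloud server would
        # take the place of this linear model.
        self.weights = weights
        self.experience = []

    def decide(self, state):
        """Forward pass of the linear policy, then greedy action selection."""
        scores = [sum(w * x for w, x in zip(row, state)) for row in self.weights]
        probs = softmax(scores)
        best = max(range(len(ACTIONS)), key=probs.__getitem__)
        return ACTIONS[best]  # (phase, duration) sent to the execution unit

    def record(self, state, action, result):
        """Buffer one experience tuple for later upload to the cloud."""
        self.experience.append((state, action, result))

    def flush(self):
        """Return and clear the buffered experience (transport not modelled)."""
        batch, self.experience = self.experience, []
        return batch
```

In use, the edge server would call `decide` once per control period with the normalized state feature vector, forward the returned phase and duration to the execution unit, call `record` with the observed control result, and periodically hand the output of `flush` to the cloud server for continued optimization of the policy network.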
In some embodiments of the first aspect of the present application, the state feature vector corresponding to the intersection data comprises at least flow and queuing features, waiting and delay features, pedestrian and non-motor vehicle features, time and environment features, special event features and regional cooperative features, and the state feature vector is generated by integrating and normalizing the sensing data of the multi-source sensing unit, including: determining the flow and queuing features based on the high-definition camera, the radar and the geomagnetic coil; determining the waiting and delay features based on the high-definition camera and the geomagnetic coil; determining the pedestrian and non-motor vehicle features based on the high-definition camera and the sensor group; determining the time and environment features based on the sensor group; determining the special event features based on the acoustic array; and determining the regional cooperative features based on edge nodes or vehicle-road cooperative units.
To achieve the above and other related objects, a second aspect of the present application provides a traffic signal dynamic intelligent control method applied to the cloud optimization layer, wherein the cloud optimization layer performs the following steps when optimizing the policy network: calculating based on the state feature vector to obtain corresponding sub-functions; adaptively and dynamically updating the sub-function weights based on the state feature vector; And car