CN-116572993-B - Intelligent vehicle risk sensitive sequential behavior decision method, device and equipment

Abstract

The application relates to an intelligent vehicle risk-sensitive sequential behavior decision method, device and equipment. The method comprises: obtaining driving state information of traffic participants in a preset traffic environment; constructing a dynamic objective function; determining a single-step behavior decision strategy of the vehicle based on the dynamic objective function; determining longitudinal and lateral dynamic safety margins in the decision process; identifying the driving intention of surrounding vehicles based on the single-step behavior decision strategy; calculating the cost values of the vehicle adopting different behavior decision strategies and matching the optimal strategy of the vehicle accordingly; repeating these steps until the risk-sensitive sequential decision strategy is consistent with the action of the vehicle at the current moment; and outputting an optimal trajectory according to the risk-sensitive sequential decision strategy. This addresses the problems that existing intelligent vehicle decision methods place certain requirements on the quantity and quality of training data samples and are difficult to apply to real complex dynamic scenes, and it achieves dynamic multi-objective coordination and stable multi-stage decision-making for intelligent vehicles in complex scenes.
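The observe, decide, act and re-observe loop summarized in the abstract resembles a receding-horizon scheme. The following is a minimal Python sketch of that general pattern in a one-dimensional car-following setting, not the patented method itself. The candidate acceleration set, the quadratic cost terms, the constant-speed prediction of the lead vehicle, and all names (`State`, `plan`, `run`, `margin`, `v_target`) are assumptions introduced for illustration only.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class State:
    gap: float     # longitudinal distance to the lead vehicle [m]
    v_ego: float   # ego vehicle speed [m/s]
    v_lead: float  # lead vehicle speed [m/s], assumed constant over the horizon


# Hypothetical discrete behaviour set: brake, hold speed, accelerate [m/s^2].
CANDIDATE_ACCELS = (-2.0, 0.0, 1.0)


def step(state: State, accel: float, dt: float) -> State:
    """Advance the toy two-vehicle system by one time step."""
    v_new = max(0.0, state.v_ego + accel * dt)
    gap_new = state.gap + (state.v_lead - v_new) * dt
    return State(gap_new, v_new, state.v_lead)


def stage_cost(state: State, v_target: float, margin: float) -> float:
    """Assumed cost: speed-tracking term plus a penalty for violating the margin."""
    efficiency = (state.v_ego - v_target) ** 2
    risk = 100.0 * max(0.0, margin - state.gap) ** 2
    return efficiency + risk


def plan(state: State, horizon: int, dt: float, v_target: float, margin: float):
    """Enumerate all action sequences over the horizon and return the cheapest."""
    best_cost, best_seq = float("inf"), None
    for seq in product(CANDIDATE_ACCELS, repeat=horizon):
        s, cost = state, 0.0
        for accel in seq:
            s = step(s, accel, dt)
            cost += stage_cost(s, v_target, margin)
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run(state: State, steps: int, horizon: int = 3, dt: float = 1.0,
        v_target: float = 15.0, margin: float = 20.0):
    """Rolling loop: plan, apply only the first action, then re-plan from the new state."""
    trajectory = [state]
    for _ in range(steps):
        seq = plan(state, horizon, dt, v_target, margin)
        state = step(state, seq[0], dt)
        trajectory.append(state)
    return trajectory


trajectory = run(State(gap=30.0, v_ego=15.0, v_lead=10.0), steps=4)
```

Only the first action of each planned sequence is applied before the state is observed again, which is what makes the horizon "roll". Exhaustive enumeration is used here purely for brevity; it grows exponentially with the horizon length.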

Inventors

  • HUANG HEYE
  • WANG JIANQIANG
  • CUI MINGYANG
  • LIU YICONG
  • HAN ZEYU
  • XU QING
  • LI KEQIANG

Assignees

  • Tsinghua University (清华大学)

Dates

Publication Date
20260505
Application Date
20230629

Claims (10)

  1. An intelligent vehicle risk-sensitive sequential behavior decision method, characterized by comprising the following steps: acquiring driving state information of traffic participants in a preset traffic environment; constructing, based on the driving state information, a dynamic objective function according to the risk sensitivity of the driver; determining a vehicle single-step behavior decision strategy based on the dynamic objective function, a longitudinal dynamic safety margin and a lateral dynamic safety margin; identifying the driving intention of surrounding vehicles based on the single-step behavior decision strategy, calculating, according to the driving intention, the cost value of the current vehicle adopting different behavior decision strategies, and matching the optimal strategy of the current vehicle according to the cost value; and determining the action of the current vehicle at the current moment according to the optimal strategy of the current vehicle, judging, based on rolling time domain optimization, whether the action quantity of the single-step output behavior decision strategy is consistent with the actual action quantity of the action output by the current vehicle at the current moment, and, when they are inconsistent, re-acquiring the driving state information of the traffic participants in the preset traffic environment until the action quantity of the risk-sensitive sequential decision strategy is consistent with the actual action quantity of the action output by the current vehicle at the current moment, and outputting an optimal trajectory according to the risk-sensitive sequential decision strategy.
  2. The method of claim 1, wherein constructing the dynamic objective function according to the risk sensitivity of the driver based on the driving state information comprises: constructing the minimum action quantity in a real physical system, and determining a unified driving target in the driver's driving decision process based on the driving state information; and outputting, based on the unified driving target, a dynamic objective function based on the minimum action quantity, wherein the dynamic objective function is: $S_i = \int_{t_0}^{t_f} L_i \, dt = \int_{t_0}^{t_f} (T_i - U_i) \, dt$; where i denotes the intelligent vehicle, $S_i$ is the dynamic objective function of intelligent vehicle i in the decision-making and planning process, $t_0$ is the initial time, $t_f$ is the end time, $L_i$ is the Lagrangian of the two-vehicle system, $T_i$ is the kinetic energy of the vehicle, and $U_i$ is the potential energy of the system.
  3. The method of claim 1, further comprising, before determining the vehicle single-step behavior decision strategy based on the dynamic objective function, the longitudinal dynamic safety margin and the lateral dynamic safety margin: dividing, based on the interaction relationships between the vehicle and the traffic participants, the preset traffic environment into a plurality of two-vehicle systems, each formed by a pair of interacting vehicles; and determining the Lagrangian equation of the two-vehicle system, and determining the longitudinal dynamic safety margin and the lateral dynamic safety margin in the decision process according to the Lagrangian equation of the two-vehicle system.
  4. The method of claim 3, wherein the Lagrangian equation of the two-vehicle system is: [equation missing in source]; the lateral dynamic safety margin is: [equation missing in source]; and the longitudinal dynamic safety margin is: [equation missing in source]; where i denotes the ego vehicle, T_i is the kinetic energy of the vehicle, and U_i is the system potential energy. The symbols for the remaining quantities are missing in the source; in order of appearance they denote: the vehicle mass; the speed of vehicle i; the speed of vehicle j; the initial time; the end time; the longitudinal constraint resistance that traffic regulations impose on the driver; the virtual driving force with which the driver's driving target propels the intelligent vehicle; the driver's longitudinal target driving force; the driver's lateral target driving force; the longitudinal speed of vehicle i; the lateral speed of vehicle i; the two lateral constraint forces generated by the two lane lines of the lane in which the target vehicle travels; the interaction risk force that one vehicle exerts on the other; the longitudinal following distance between vehicle i and vehicle j; the lateral following distance between the two vehicles; the lateral safety margin of the vehicle; a positive correlation function; the step-back time step; and the longitudinal speed of vehicle j.
  5. The method of claim 1, wherein the risk-sensitive sequential decision strategy comprises: adjusting the vehicle driving strategy by rolling optimization within a time window to obtain the expression of the optimal dynamic objective function in the rolling time domain; and solving for the extremum of the functional based on a preset variational method, and obtaining the risk-sensitive sequential decision strategy from the solution.
  6. The method of claim 5, wherein the optimal dynamic objective function is expressed in the rolling time domain as: [equation missing in source]; [equation missing in source]; where S is the actual action quantity and k denotes the time instant. The symbols for the remaining quantities are missing in the source; in order of appearance they denote: the cost function; the input vector; the state vector; the target set; the ideal action quantity; the rolling time domain; the time increment; the control input value from the current time to a future time; the predicted value from the current time to a future time; and the terminal penalty term.
  7. An intelligent vehicle risk-sensitive sequential behavior decision device, comprising: a traffic participant driving information acquisition module, configured to acquire driving state information of traffic participants in a preset traffic environment; a dynamic objective function construction module, configured to construct a dynamic objective function according to the risk sensitivity of the driver based on the driving state information; an intelligent vehicle single-step behavior decision module, configured to determine a vehicle single-step behavior decision strategy based on the dynamic objective function, a longitudinal dynamic safety margin and a lateral dynamic safety margin; a behavior decision cost calculation and strategy selection module, configured to identify the driving intention of surrounding vehicles based on the single-step behavior decision strategy, calculate, according to the driving intention, the cost value of the current vehicle adopting different behavior decision strategies, and match the optimal strategy of the current vehicle according to the cost value; and a construction and optimization module, configured to determine the action of the current vehicle at the current moment according to the optimal strategy of the current vehicle, judge, based on rolling time domain optimization, whether the action quantity of the single-step output behavior decision strategy is consistent with the actual action quantity of the action output by the current vehicle at the current moment, and, when they are inconsistent, re-acquire the driving state information of the traffic participants in the preset traffic environment until the action quantity of the risk-sensitive sequential decision strategy is consistent with the actual action quantity of the action output by the current vehicle at the current moment, and output an optimal trajectory according to the risk-sensitive sequential decision strategy.
  8. The device of claim 7, wherein the dynamic objective function construction module is specifically configured to: construct the minimum action quantity in a real physical system, and determine a unified driving target in the driver's driving decision process based on the driving state information; and output, based on the unified driving target, a dynamic objective function based on the minimum action quantity, wherein the dynamic objective function is: $S_i = \int_{t_0}^{t_f} L_i \, dt = \int_{t_0}^{t_f} (T_i - U_i) \, dt$; where i denotes the intelligent vehicle, $S_i$ is the dynamic objective function of intelligent vehicle i in the decision-making and planning process, $t_0$ is the initial time, $t_f$ is the end time, $L_i$ is the Lagrangian of the two-vehicle system, $T_i$ is the kinetic energy of the vehicle, and $U_i$ is the potential energy of the system.
  9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor executing the program to implement the intelligent vehicle risk-sensitive sequential behavior decision method of any one of claims 1-6.
  10. A computer-readable storage medium having stored thereon a computer program, the program being executed by a processor to implement the intelligent vehicle risk-sensitive sequential behavior decision method of any one of claims 1-6.
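Claims 2 and 8 describe the dynamic objective function as an action quantity built from a Lagrangian with kinetic energy T_i and system potential energy U_i between an initial and an end time. As a rough illustration of what evaluating such a quantity involves, the Python sketch below approximates an action integral of the form ∫ (T − U) dt along a sampled one-dimensional path. The discretization, the default zero potential, the vehicle mass and the two sample paths are assumptions chosen for illustration; the patent's actual potential energy and force terms are not reproduced here.

```python
from typing import Callable, Sequence


def discrete_action(
    positions: Sequence[float],
    dt: float,
    mass: float,
    potential: Callable[[float], float] = lambda x: 0.0,
) -> float:
    """Approximate S = integral of (T - U) dt along a sampled 1-D path."""
    action = 0.0
    for x0, x1 in zip(positions, positions[1:]):
        v = (x1 - x0) / dt               # finite-difference speed on the segment
        kinetic = 0.5 * mass * v * v     # T = 1/2 * m * v^2
        x_mid = 0.5 * (x0 + x1)          # evaluate U at the segment midpoint
        action += (kinetic - potential(x_mid)) * dt
    return action


# Two hypothetical paths covering 40 m in 4 s with the same endpoints.
steady = [0.0, 10.0, 20.0, 30.0, 40.0]   # constant speed
uneven = [0.0, 5.0, 20.0, 35.0, 40.0]    # slow, fast, fast, slow

s_steady = discrete_action(steady, dt=1.0, mass=1500.0)
s_uneven = discrete_action(uneven, dt=1.0, mass=1500.0)
```

With zero potential, the constant-speed path yields the smaller action of the two, consistent with the least-action behavior of a free particle between fixed endpoints. A non-trivial `potential` callable could stand in for risk-related potential energy, but its form would have to come from the full patent text.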

Description

Intelligent vehicle risk-sensitive sequential behavior decision method, device and equipment

Technical Field

The application relates to the technical field of intelligent automobile applications, and in particular to an intelligent vehicle risk-sensitive sequential behavior decision method, device and equipment.

Background

In complex scenarios, an intelligent vehicle decision system needs to output a stable, continuous and reasonable decision strategy to meet actual driving requirements. However, during actual driving, performance requirements are coupled in various ways and the multi-stage, multi-scene decision process is incoherent, which poses many difficulties for research on multi-objective coordination and multi-stage decision performance assurance for intelligent vehicles. The key challenge is how to complete the sequential behavior decisions of the vehicle in a dynamic environment and plan a feasible trajectory that meets performance targets such as safety, efficiency and reliability.

In a human-vehicle-road system, the driver plays multiple roles, such as decision maker and experimenter, and the driver's perception-decision-control characteristics directly influence vehicle control stability and safety. Drivers share a common cognitive mechanism and control law with respect to potential risks in the traffic environment, but different types of risk sources affect drivers differently, and a driver's risk response affects scene understanding and decision strategy output, and thereby driving safety.

In the related art, research on advanced driver assistance systems and automated driving aims to raise the intelligence level of vehicles by selecting the most suitable driving behavior for complex dynamic traffic environments.
However, a human driver can effectively handle the complex time-varying traffic environment that arises from the coupling of multiple elements such as people, vehicles and roads, and follows a unified active decision principle of coordinated "perception-evaluation-decision" in any scene rather than targeting a single dangerous scene. It is therefore necessary to draw inspiration from the active decision mode that drivers adopt to cope with complex and changeable traffic environments, so that an automated driving system can adapt to open and dynamic traffic scenes and achieve safe, reliable, fast and smooth autonomous driving in real traffic.

In addition, a complex dynamic interaction environment contains various uncertainty factors, such as the randomness of traffic participant behavior (including randomness in intention and trajectory interaction) and the dynamics of the traffic environment (static/dynamic blind areas, sensor errors and the like). Research on intelligent vehicle decision systems is therefore critical: the decision system must output a good and stable decision strategy while the vehicle is driving in order to meet actual driving requirements.

At present, a great deal of research on intelligent vehicle decision-making has been carried out at home and abroad. In the related art, the centralized decision framework is mainly based on an integrated approach: using environmental information received by sensors, it learns from driving behavior data, or explores by trial and error, through end-to-end methods (such as deep learning and reinforcement learning), and directly outputs low-level vehicle control commands from the sensor input.

In the related art, the hierarchical decision framework decomposes the whole decision process into a series of sub-functional modules, each of which can be designed independently; usually, trajectory planning is performed after the behavior decision. The overall decision process under the hierarchical decision framework can be categorized into single-stage (single-step) behavior decisions and multi-stage (multi-step) sequential decisions. Single-step behavior decision research methods include traditional rule-based or optimization-based decision methods, decision methods based on probabilistic statistical reasoning, decision methods based on behavior interaction, and the like. Multi-stage or multi-step sequential decision (or trajectory planning) research methods mainly include search, interpolation, sampling, artificial potential fields, and the like. In the related art, the data-driven centralized decision method does not need to rely on limited expert rules to make decisions, and the strategy network directly generates test cases from real driving data, so the completeness and real-time performance of the decisions are better. However, the essence of the centralized decisi