CN-121979044-A - VLA-based low-speed park automatic driving method


Abstract

The application discloses a VLA-based low-speed park automatic driving method, which relates to the technical field of automatic driving. The method comprises: collecting multiple types of park environment data and generating a panoramic park environment perception tensor; performing semantic element identification and dynamic subject motion trend prediction on that tensor to generate a park environment semantic analysis topology with dynamic motion trends; performing anticipatory obstacle avoidance path planning and stability path optimization on the park environment semantic analysis topology with dynamic motion trends and a preset driving path to generate a park automatic driving control instruction set with dynamic safety redundancy; and executing the instruction set while collecting vehicle driving state feedback data to generate a multimodal-path planning collaborative optimization parameter tensor, based on which the automatic driving execution logic is continuously adjusted. Automatic driving can thus keep adapting to scene changes during long-term operation, which improves operating stability and robustness.

Inventors

FAN ZHICHAO
WU XIAOYUAN
JIANG MANG
ZHANG YIZHANG
XU JINBO
YAO JINTIAN

Assignees

北京甲板智慧科技有限公司

Dates

Publication Date: 20260505
Application Date: 20260121

Claims (10)

1. A VLA-based low-speed park automatic driving method, comprising: Step 1, collecting road image data, obstacle laser point cloud data, environment audio data, park voice interaction data and vehicle action state data of a park environment to generate a panoramic park environment perception tensor; Step 2, performing semantic element identification and dynamic subject motion trend prediction on the panoramic park environment perception tensor to generate a park environment semantic analysis topology with dynamic motion trends; Step 3, performing anticipatory obstacle avoidance path planning and stability path optimization on the park environment semantic analysis topology with dynamic motion trends and a preset driving path to generate a park automatic driving control instruction set with dynamic safety redundancy; and Step 4, executing the park automatic driving control instruction set with dynamic safety redundancy and collecting vehicle driving state feedback data to generate a multimodal-path planning collaborative optimization parameter tensor, and continuously adjusting the automatic driving execution logic based on the multimodal-path planning collaborative optimization parameter tensor.
2. The VLA-based low-speed park automatic driving method of claim 1, wherein Step 1 comprises: Step 11, performing spatio-temporal reference alignment on the road image data, obstacle laser point cloud data, environment audio data, park voice interaction data and vehicle action state data acquired by a VLA multimodal heterogeneous sensing fusion unit to generate a multimodal initial calibration data tensor; Step 12, performing multi-sensor accuracy weight assignment and scene suitability calibration on the multimodal initial calibration data tensor to generate multimodal weight-adapted data; and Step 13, performing multi-source conflict detection and weighted resolution on the multimodal weight-adapted data and fusing the result to generate the panoramic park environment perception tensor, wherein the panoramic park environment perception tensor binds road marking features, obstacle three-dimensional features, abnormal sound spatial features, road marking visual features, voice interaction semantic features and vehicle action state features together along linked tensor dimensions, and the feature dimensions correspond one-to-one to the perception requirements of park automatic driving.
3. The VLA-based low-speed park automatic driving method of claim 2, wherein Step 13 comprises: Step 131, performing feature conflict detection on the multimodal weight-adapted data and identifying conflict items such as vision-laser semantic contradictions, audio-vision spatial misalignments, vision-voice semantic contradictions and action-vision state misalignments, to form a multimodal feature conflict list; Step 132, performing weighted resolution on the multimodal feature conflict list based on preset weights and retaining the feature results of high-weight modalities to obtain single-modality feature arrays after weighted resolution; and Step 133, fusing the single-modality feature arrays after weighted resolution to generate the panoramic park environment perception tensor, wherein the feature credibility of the panoramic park environment perception tensor is obtained by a weighted calculation over the modality weights, and the three types of environment features undergo unified dimension integration.
4. The VLA-based low-speed park automatic driving method of claim 1, wherein Step 2 comprises: Step 21, performing semantic element segmentation on the panoramic park environment perception tensor to identify four core semantic elements, namely road areas, pedestrians, park operation vehicles and fixed obstacles, and generating a basic park semantic segmentation result; Step 22, performing dynamic motion trend prediction for the pedestrians and park operation vehicles based on a park subject behavior database to generate a dynamic subject motion trend prediction result; and Step 23, fusing the basic park semantic segmentation result and the dynamic subject motion trend prediction result to generate the park environment semantic analysis topology with dynamic motion trends, wherein the topology associates static semantic regions with dynamic subject motion trajectories by means of topological edges and provides a constraint basis for path planning.
5. The VLA-based low-speed park automatic driving method of claim 4, wherein Step 22 comprises: Step 221, retrieving the park subject behavior database, which pre-stores reference behavior data associated with high-frequency dynamic conflicts in the park, such as typical pedestrian road-crossing trajectories and typical steering patterns of park operation vehicles, to form a park subject reference behavior data set; Step 222, inputting the real-time position and motion speed of each dynamic subject into a behavior matching model and performing feature matching and probability calculation based on the park subject reference behavior data set to obtain a dynamic subject behavior trend index matrix; and Step 223, generating the dynamic subject motion trend prediction result based on the dynamic subject behavior trend index matrix and a predicted dynamic subject trajectory derived from the park subject reference behavior data set, wherein the prediction horizon of the result is adapted to the response-time requirement of low-speed automatic driving and provides dynamic constraints for the semantic topology.
6. The VLA-based low-speed park automatic driving method of claim 1, wherein Step 3 comprises: Step 31, identifying the drivable region and the dynamic constraint boundaries on the park environment semantic analysis topology with dynamic motion trends to generate an initial driving path; Step 32, performing anticipatory obstacle avoidance planning on the initial driving path and presetting deceleration detour trajectories for high-probability dynamic conflict scenarios to obtain an intermediate path with an obstacle avoidance strategy; and Step 33, performing low-speed stability optimization on the intermediate path with the obstacle avoidance strategy, adding dynamic safety redundancy, and converting the optimized path into steering, acceleration and deceleration control instructions to generate the park automatic driving control instruction set with dynamic safety redundancy, wherein the instruction set links control parameters, including steering angle, acceleration and braking intensity, with the dynamic subject motion trend prediction result and directly drives the vehicle to execute driving actions.
7. The VLA-based low-speed park automatic driving method of claim 6, wherein Step 33 comprises: Step 331, performing curvature smoothing optimization on the intermediate path with the obstacle avoidance strategy so that the path curvature meets the requirement of low-speed driving stability, to obtain a smoothed obstacle avoidance path; Step 332, adding dynamic safety redundancy to the control parameters corresponding to the smoothed obstacle avoidance path, including extending the braking response time and enlarging the lateral obstacle avoidance distance, to generate a control parameter configuration table with safety redundancy; and Step 333, converting the control parameter configuration table with safety redundancy into steering, acceleration and deceleration control instructions to generate the park automatic driving control instruction set with safety redundancy, wherein the parameter ranges of the instruction set conform to the low-speed park driving safety specification.
8. The VLA-based low-speed park automatic driving method of claim 1, wherein Step 4 comprises: Step 41, collecting vehicle driving trajectory data, obstacle avoidance effect evaluation data, multimodal data fusion matching degree data and dynamic prediction accuracy data, and generating a standardized driving state feedback tensor through normalization; Step 42, inputting the standardized driving state feedback tensor into a multimodal fusion-path planning closed-loop collaborative optimization model and adjusting the sensor weights of the multimodal spatio-temporal calibration-weighted conflict resolution fusion architecture to obtain a sensor weight adjustment parameter configuration table; and Step 43, correcting the dynamic trend prediction logic and optimizing the safety redundancy parameters based on the standardized driving state feedback tensor, fusing the sensor weight adjustment parameter configuration table with the correction and optimization results to generate the multimodal-path planning collaborative optimization parameter tensor, and updating the automatic driving execution logic based on the parameter tensor.
9. The VLA-based low-speed park automatic driving method of claim 8, wherein Step 42 comprises: Step 421, adjusting the weight distribution of the laser, vision, audio, voice and action state sensors based on the multimodal data fusion matching degree data in the standardized driving state feedback tensor, and adjusting the weight share of high-matching-degree modalities, to obtain a sensor weight distribution adjustment result; Step 422, optimizing the time-sequence alignment accuracy of the multimodal spatio-temporal calibration based on the dynamic prediction accuracy data in the standardized driving state feedback tensor to obtain spatio-temporal calibration accuracy optimization parameters; and Step 423, generating the sensor weight adjustment parameter configuration table based on the sensor weight distribution adjustment result and the spatio-temporal calibration accuracy optimization parameters, wherein the parameter configuration table provides a direct basis for updating the parameters of the vision-voice-action multimodal spatio-temporal calibration-weighted conflict resolution fusion architecture.
10. The VLA-based low-speed park automatic driving method of claim 8, wherein Step 43 comprises: Step 431, correcting the behavior matching weights of the park subject behavior database based on the dynamic prediction accuracy data in the standardized driving state feedback tensor to obtain behavior database matching weight correction parameters; Step 432, optimizing the safety redundancy parameters of the low-speed obstacle avoidance path based on the quantized obstacle avoidance effect data in the standardized driving state feedback tensor to obtain a safety redundancy parameter optimization result; and Step 433, fusing the behavior database matching weight correction parameters, the safety redundancy parameter optimization result and the sensor weight adjustment parameter configuration table to generate the multimodal-path planning collaborative optimization parameter tensor, which performs dimension integration on the optimization parameters of each module according to priority, wherein the priority ordering is strongly associated with automatic driving safety indicators.
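As an illustration of the conflict handling recited in claim 3 (Steps 131 to 133), the following is a minimal sketch, not the claimed implementation. It assumes each modality reports one semantic label per target, that conflicts are disagreements between those labels, and that credibility is the share of modality weight agreeing with the retained label. The names `Observation`, `PRESET_WEIGHTS`, `target_id` and the weight values are hypothetical, and the result is a plain dictionary rather than a tensor.

```python
from dataclasses import dataclass

# Hypothetical preset modality weights; the claims say only that weights are preset.
PRESET_WEIGHTS = {"laser": 0.35, "vision": 0.30, "audio": 0.15, "voice": 0.10, "action": 0.10}


@dataclass(frozen=True)
class Observation:
    modality: str   # one of the keys in PRESET_WEIGHTS
    target_id: str  # hypothetical key for the object or region being described
    label: str      # semantic label this modality reports for the target


def group_by_target(observations):
    """Collect the observations of all modalities per target."""
    groups = {}
    for obs in observations:
        groups.setdefault(obs.target_id, []).append(obs)
    return groups


def detect_conflicts(groups):
    """Step 131 (sketch): list targets whose modalities report different labels."""
    return sorted(t for t, g in groups.items() if len({o.label for o in g}) > 1)


def resolve(group, weights):
    """Step 132 (sketch): keep the label reported by the highest-weight modality.

    The credibility is an assumed form of the weighted calculation in Step 133:
    the share of the group's total weight that agrees with the kept label.
    """
    winner = max(group, key=lambda o: weights[o.modality])
    total = sum(weights[o.modality] for o in group)
    agreeing = sum(weights[o.modality] for o in group if o.label == winner.label)
    return winner.label, agreeing / total


def fuse(observations, weights=PRESET_WEIGHTS):
    """Return ({target_id: (label, credibility)}, conflict_list)."""
    groups = group_by_target(observations)
    fused = {t: resolve(g, weights) for t, g in groups.items()}
    return fused, detect_conflicts(groups)


observations = [
    Observation("vision", "obj1", "pedestrian"),
    Observation("laser", "obj1", "fixed_obstacle"),
    Observation("vision", "obj2", "road_area"),
    Observation("laser", "obj2", "road_area"),
]
fused, conflicts = fuse(observations)
print(conflicts)
print(fused)
```

In this toy input, vision and laser disagree about `obj1`, so it appears in the conflict list and the laser label is retained because laser carries the larger assumed weight. A real system would operate on feature arrays and would also have to handle the spatial and state misalignments named in Step 131.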

Description

VLA-based low-speed park automatic driving method
Technical Field
The invention relates to the technical field of automatic driving, and in particular to a VLA-based low-speed park automatic driving method.
Background
Automatic driving in a low-speed park must ensure driving safety while also balancing traffic efficiency and ride comfort, which places strict requirements on the system's environment understanding, decision response and adaptive adjustment capabilities. In densely populated areas in particular, the uncertainty of pedestrian behavior and the sudden nature of voice interaction demands require the system to have multidimensional perception and anticipatory decision-making capabilities.
Existing low-speed park automatic driving schemes rely mainly on vision sensors to collect road image information and use a semantic segmentation algorithm to identify roads and obstacles; they use a simple voice command receiving module that responds only to control commands in fixed sentence patterns; and they use vehicle action state data only for real-time control feedback, without feeding it into the optimization of perception and decision logic. Such a scheme captures environment images with a camera, identifies key elements, generates a driving path according to preset rules, and finally adjusts the control instructions according to state data such as the current vehicle speed and attitude.
However, this scheme has obvious technical defects: no end-to-end fusion architecture for vision, language and action is constructed; visual perception is easily affected by illumination and occlusion; voice interaction stays at the command-receiving level and cannot be deeply linked with environment perception and action state; there is no prediction of dynamic subject motion trends, so decisions depend on real-time perception results and responses lag; and there is no closed-loop optimization mechanism, so vehicle action state data does not feed back into the adjustment of perception and decision logic. As a result, the system's adaptability and robustness in complex scenes are weak, and it is difficult to meet the safety and intelligence requirements of low-speed park automatic driving.
Disclosure of Invention
In order to solve the above technical problems, the present application provides a VLA-based low-speed park automatic driving method that at least alleviates them.
The technical scheme provided by the embodiments of the application is as follows: A VLA-based low-speed park automatic driving method comprises: Step 1, collecting road image data, obstacle laser point cloud data, environment audio data, park voice interaction data and vehicle action state data of a park environment to generate a panoramic park environment perception tensor; Step 2, performing semantic element identification and dynamic subject motion trend prediction on the panoramic park environment perception tensor to generate a park environment semantic analysis topology with dynamic motion trends; Step 3, performing anticipatory obstacle avoidance path planning and stability path optimization on the park environment semantic analysis topology with dynamic motion trends and a preset driving path to generate a park automatic driving control instruction set with dynamic safety redundancy; and Step 4, executing the park automatic driving control instruction set with dynamic safety redundancy and collecting vehicle driving state feedback data to generate a multimodal-path planning collaborative optimization parameter tensor, and continuously adjusting the automatic driving execution logic based on the multimodal-path planning collaborative optimization parameter tensor.
The application has the following technical advantages:
By collecting road images, obstacle laser point clouds, environment audio, park voice interaction and vehicle action state data and generating the panoramic park environment perception tensor, the technical scheme achieves comprehensive integration and association of multimodal data. Traditional schemes are dominated by vision, or by vision and laser, and ignore the intent information in voice interaction, the environmental cues in audio, and the value of linking vehicle action state with environment perception, which leaves more perception blind spots in complex scenes.
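The feedback loop of Step 4 can be pictured with a small sketch of one sensor weight update. This is an assumption-laden illustration rather than the method of the application: the text states only that sensor weights are adjusted from fusion matching degree feedback, so the multiplicative update rule, the `rate` parameter and the function name `adjust_sensor_weights` are all hypothetical, as is the choice to raise the share of modalities that matched better than average.

```python
def adjust_sensor_weights(weights, matching_degree, rate=0.5):
    """Shift weight toward modalities with above-average matching degree.

    weights:          current per-modality weights, summing to 1
    matching_degree:  per-modality fusion matching degree in [0, 1] (feedback data)
    rate:             hypothetical step size in [0, 1]; 0 leaves the weights unchanged
    """
    mean_match = sum(matching_degree.values()) / len(matching_degree)
    # Scale each weight by its deviation from the mean matching degree (assumed rule).
    raw = {
        m: max(w * (1.0 + rate * (matching_degree[m] - mean_match)), 0.0)
        for m, w in weights.items()
    }
    total = sum(raw.values())
    if total == 0.0:
        return dict(weights)  # degenerate case: keep the previous weights
    # Renormalize so the adjusted weights again sum to 1.
    return {m: v / total for m, v in raw.items()}


current = {"laser": 0.4, "vision": 0.4, "audio": 0.2}
feedback = {"laser": 0.9, "vision": 0.5, "audio": 0.7}
updated = adjust_sensor_weights(current, feedback)
print(updated)
```

Calling the function once per feedback cycle gives a simple closed loop: a modality that keeps agreeing with the fused result gradually gains influence, while one degraded by, for example, poor lighting loses it. A small `rate` keeps a single noisy cycle from swinging the weights too far.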
The multi-type data acquisition and tensor fusion exploit the complementarity of different modalities: for example, voice interaction data can capture people's cooperation intent, environment audio data can help locate sources of abnormal risk, and vehicle action state data can calibrate the relative relationships in environment perception. The dimensions of environment perception thus become richer and more strongly associated, meeting the specific scene requirements of mixed pedestrian-vehicle traffic and frequent interaction in a low-speed park, and the problem of insufficient scene suita