CN-121744243-B - Vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning


Abstract

The invention relates to the field of trajectory prediction for autonomous driving, and in particular to a vehicle trajectory prediction and monitoring method guided by implicit future interaction learning in a discriminator. The method comprises the following steps: step 1, designing a training framework for a trajectory prediction model based on a cGAN; step 2, designing a subtask network PRNet for evaluating future interaction relations among vehicles; step 3, alternately training a generator G and a discriminator D based on a dynamic weight strategy; step 4, applying quantization-aware training to the trained trajectory prediction generator model G to make the model lightweight and optimize it for deployment; and step 5, deploying the quantized prediction model obtained in step 4 on the embedded computing platform of the vehicle at the vehicle-mounted device end to perform real-time prediction and inference, and visualizing the results on the real vehicle and at the monitoring end. The method thereby achieves advantages in trajectory accuracy and in the social compliance of the predicted trajectories.

Inventors

GAO ZHEN
WANG LIYOU
CHEN XIAOWEN
XU JINGNING
HANG PENG
YU RONGJIE

Assignees

Tongji University

Dates

Publication Date: 20260505
Application Date: 20260227

Claims (9)

1. A vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning, characterized by comprising the following steps:
Step 1, designing a training framework for a trajectory prediction model based on the cGAN framework;
The cGAN framework comprises a generator G and a discriminator D, and the trajectory prediction model is obtained through adversarial training of the generator G and the discriminator D;
The generator G is used for trajectory prediction: vehicle historical trajectory features and map features serve as condition information y, and the condition information y together with a noise vector sampled from a random noise distribution is input into the generator to generate a predicted future trajectory z; the discriminator D supervises and judges the generation quality of the generator G: the predicted future trajectory z and the real future trajectory x are each input into the discriminator D together with the condition information y, and the discriminator judges whether each is real or fake;
In the discriminator D, the future interaction relations among vehicles are predicted by a future-interaction subtask network PRNet, and the total loss function takes into account the future interaction evaluation loss pr_loss in addition to the adversarial training loss d_loss;
Step 2, designing the subtask network PRNet for evaluating future interaction relations among vehicles, and calculating the future interaction evaluation loss;
For the future interaction evaluation in the discriminator, a subtask network PRNet for evaluating the future interaction relations among vehicles is designed; the inter-vehicle interaction probability rr of the real future trajectories, calculated from road topology rules, is used as the label, and the network is guided to attend to the future interaction relations among vehicles by minimizing the difference between the predicted future relation pr and the label rr;
Step 3, alternately training the generator G and the discriminator D based on a dynamic weight strategy;
Step 4, model quantization: the trained trajectory prediction generator model G undergoes quantization-aware training to make the model lightweight and optimize it for deployment;
Step 5, real-vehicle deployment and predicted trajectory visualization: the quantized prediction model obtained in step 4 is deployed on the embedded computing platform of the vehicle at the vehicle-mounted device end to perform real-time prediction and inference, and the results are visualized on the real vehicle and at the monitoring end;
Step 2 comprises the following steps:
Step 2.1, designing the network structure of the future interaction evaluation subtask;
The future interaction evaluation network PRNet predicts the future interaction relations among vehicles; assuming the number of vehicles is N, its input is the fused features of the historical trajectories, the predicted or real trajectories of the vehicles in the same scene, and the map, and its output is an evaluation matrix pr of the interaction relations among vehicles in a future period;
The network processing is as follows: vehicle features are combined pairwise, the feature of vehicle i being concatenated with the feature of vehicle j; the concatenated feature passes in turn through several blocks, each consisting of a linear fully connected layer and its activation layer, and an output layer outputs a score; the outputs of all vehicle pairs are arranged into an N×N matrix, and each row of the matrix is normalized to obtain a probability matrix pred_proximity, i.e. pr, with values between 0 and 1;
Step 2.2, calculating the true value of the future interaction relation as the training label;
Labels for future interaction are generated by rule-based calculation on the road topology: assuming N vehicles and M road center lines, first, an initial occupancy matrix of each vehicle on the road center points is obtained from the Euclidean distance from the vehicle to the road center, where 1 means occupied and 0 means unoccupied; then, to further model the potential movement intent of the vehicle, the initial road occupancy of the vehicle is propagated to its adjacent lanes in a weighted manner, based on the predecessor, successor, left-adjacent and right-adjacent topological relations between lanes and on the prior probability that the vehicle appears on the adjacent road, to obtain a smoothed occupancy; the occupancies of a vehicle over all roads sum to 1, and the smoothed occupancy physically represents the probability that the vehicle appears at each road center line position at the next moment; because vehicles with a high occupancy probability on the same road center line are more likely to interact, the smoothed occupancy matrix is multiplied by its own transpose to obtain a symmetric inter-vehicle interaction probability matrix; finally, normalization yields rule_probability, the share of interaction likelihood with every other vehicle from the viewpoint of a given vehicle, so that the interaction relation better matches the actual situation;
The calculation process is as follows: [equations not reproduced], where the terms denote, respectively, the coordinate matrix of the vehicles, the coordinate matrix of the road center lines, the distance function, the distance threshold, and the topological relation matrix; the topological relations are divided into predecessor (pre), successor (sub), left-adjacent (left) and right-adjacent (right) relations, and the predecessor and successor relations are further divided into directly adjacent nodes, second-order adjacent nodes, and so on up to fifth-order adjacent nodes; each topological relation is represented by an M×M 0/1 matrix that is multiplied by its own influence coefficient, and the products are summed; the influence coefficients are iteratively optimized by back propagation during training, and this weighted summation over the topological relations among roads yields the topological relation matrix, with the formula: [equation not reproduced], where each influence coefficient represents the weight of a given adjacent topological relation in the relation calculation and depends on the probability that the vehicle appears on the adjacent topological road section at the next moment, the four relations corresponding to the vehicle appearing on the road ahead, the road behind, the road on the left and the road on the right, respectively;
Step 2.3, calculating the loss of the future interaction evaluation of the vehicles;
The error pr_loss between the predicted future interaction probability pred_proximity and the interaction probability label rule_proximity is calculated; PRNet is trained to capture potential future interaction relations among vehicles by reducing this error, and, through back propagation of the adversarial gradients during training, the generator implicitly learns the interaction relations among vehicles in its feature extraction and feature fusion stages, wherein pr_loss is designed as follows: [equation not reproduced].
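As a non-limiting illustration of the rule-based label computation of step 2.2, the sequence "occupancy, smoothing, self-multiplication, normalization" may be sketched as follows. The function name `rule_interaction_labels`, the distance threshold value, the use of the nearest sampled center line point as the vehicle-to-road distance, and the single pre-weighted `adjacency` matrix standing in for the weighted sum of topological relation matrices are all assumptions of this sketch and are not taken from the claim. Zeroing the diagonal follows the data processing described for step 3.1.

```python
import math

def rule_interaction_labels(vehicles, centerlines, adjacency, threshold=2.0):
    """Sketch of a rule-based interaction label matrix (N x N).

    vehicles:    list of N (x, y) positions
    centerlines: list of M center lines, each a list of (x, y) points
    adjacency:   M x M weights (hypothetical: the already weighted sum of
                 the 0/1 topological relation matrices)
    threshold:   distance threshold for occupancy (assumed value)
    """
    n, m = len(vehicles), len(centerlines)

    # Initial 0/1 occupancy: 1 if the vehicle lies within the threshold of a center line.
    occ = [[1.0 if min(math.dist(v, p) for p in line) <= threshold else 0.0
            for line in centerlines] for v in vehicles]

    # Propagate occupancy to topologically adjacent lanes, then renormalize
    # so that each vehicle's occupancy over all roads sums to 1.
    smooth = []
    for row in occ:
        spread = [row[j] + sum(row[k] * adjacency[k][j] for k in range(m))
                  for j in range(m)]
        total = sum(spread)
        smooth.append([s / total if total > 0 else 0.0 for s in spread])

    # Symmetric interaction matrix: smoothed occupancy times its own transpose,
    # with the diagonal (self-interaction) set to zero.
    inter = [[0.0 if i == j else sum(a * b for a, b in zip(smooth[i], smooth[j]))
              for j in range(n)] for i in range(n)]

    # Row normalization: share of interaction likelihood from each vehicle's viewpoint.
    labels = []
    for row in inter:
        total = sum(row)
        labels.append([r / total if total > 0 else 0.0 for r in row])
    return labels
```

In this sketch, two vehicles on the same center line receive all of each other's interaction share, while a vehicle alone on a distant, unconnected road receives an all-zero row.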
2. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 1, wherein, in step 1, the overall objective of the adversarial training is as follows: [equation not reproduced], where the terms denote, respectively: the cost function of the generative adversarial network; the expected log-probability that the discriminator makes a correct judgment when the input sample comes from the real data distribution; and the expected log-probability that the discriminator makes a correct judgment when the input sample is generated by the generator from the noise prior; for the discriminator D, the higher the expected probability of a correct judgment the better, i.e. the sum of the two expectation terms is maximized, whereas the goal of the generator G is exactly the opposite, i.e. the sum of the two expectation terms is minimized; in the adversarial training, G and D are trained alternately until a Nash equilibrium is reached; to save inference time, the discriminator D is used only in training, and in practical application only the generator G is used for trajectory prediction.
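For orientation, the description in claim 2 matches the standard conditional GAN minimax objective, which may be written as below. This is a sketch under stated assumptions and not the claim's own formula: the noise symbol \epsilon and its prior p_\epsilon are chosen here only to avoid a clash with z, which claim 1 uses for the predicted future trajectory, and conditioning the discriminator on y is assumed from claim 1.

```latex
\min_{G}\max_{D} V(D,G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x \mid y)\bigr]
  + \mathbb{E}_{\epsilon \sim p_{\epsilon}}\bigl[\log\bigl(1 - D(G(\epsilon \mid y) \mid y)\bigr)\bigr]
```

The first term rewards D for scoring real future trajectories x close to 1, and the second rewards D for scoring generated trajectories z = G(\epsilon \mid y) close to 0; G is trained to reduce the second term.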
3. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 1, wherein step 1 comprises:
Step 1.1, designing the basic structure of the generator G;
The generator G performs feature extraction and feature fusion on the input historical trajectories and map information, and uses the result as the condition information y of the generator; the specific process is as follows: in the feature fusion stage, the trajectory and map features are fused in the temporal and spatial dimensions using a convolutional network or a cross-attention network; Gaussian noise is sampled and processed, i.e. a vector of a specified length is randomly generated and merged with the fused trajectory and map features into a new feature vector, which serves as the input of the trajectory generation module; the predicted trajectories are generated by first processing the merged feature vector with an attention mechanism and then outputting the predicted trajectories of all vehicles with a prediction network;
Step 1.2, designing the loss function of the generator G;
The loss function of G is: [equation not reproduced]; the regression loss reg_loss measures the error between the generated trajectories and the real trajectory in multi-modal trajectory prediction, with the formula: [equation not reproduced], where N is the number of vehicles in a scene, K is the number of different multi-modal trajectories generated, T is the prediction horizon, i.e. the total number of frames, and the remaining two terms are the coordinates of the k-th predicted trajectory of the i-th vehicle at time t and the coordinates of the real trajectory of the i-th vehicle at time t;
The classification loss cls_loss optimizes the generator's confidence estimates for the trajectories, so that among the K generated modes the one closest to the real trajectory receives the highest confidence score, improving the accuracy of trajectory ranking; for the classification loss, the average distance between the predicted trajectory in each mode k and the real trajectory is first calculated for each vehicle; the mode with the smallest average distance is then selected as the best mode; finally, the classification loss is defined as the negative log-likelihood of the best-mode confidence, averaged over all vehicles; here the confidence score that the generator G assigns to the predicted trajectory of the k-th mode of the i-th vehicle is obtained by mapping trajectory features through an MLP, and the best-mode confidence is the trajectory confidence score of the selected best mode;
g_loss is the loss of G in the adversarial training, calculated as: [equation not reproduced];
Step 1.3, designing the basic structure of the discriminator D;
The discriminator D scores the plausibility of the generator's output and evaluates the future interaction relations; its inputs are the condition information y, the real future trajectory x, and the predicted future trajectory output by the generator G; feature extraction and feature fusion are first performed on the inputs, the features are then fed into separate neural networks for scoring and for future interaction evaluation, and finally a real/fake trajectory score between 0 and 1 and a pairwise inter-vehicle interaction probability matrix are output; specifically:
Feature extraction and fusion: the historical trajectory is concatenated in the time dimension with the predicted future trajectory, or with the real trajectory x; features of the resulting complete trajectory and of the map are extracted separately and then fused;
Real/fake scoring: a neural network processes the fused features, and a fully connected layer then outputs a score between 0 and 1; the closer the score is to 1, the more real the discriminator considers the trajectory to be;
Future interaction evaluation: the features of the vehicles are concatenated pairwise, the concatenated features are processed by a neural network, and a fully connected layer outputs a value between 0 and 1 representing the predicted probability of interaction in the future period; the closer the value is to 1, the more likely an interaction is;
Step 1.4, designing the loss function of the discriminator D;
For the discriminator D, in addition to the adversarial training loss d_loss, the future interaction evaluation loss pr_loss is added, and the total loss function is defined as: [equation not reproduced]; d_loss is the loss of the discriminator D in the adversarial training, calculated as: [equation not reproduced]; pr_loss is the error between the predicted future interaction probability and the interaction probability label, calculated by the future-interaction subtask network PRNet.
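As a non-limiting illustration of the classification loss of step 1.2, the three stages (average distance per mode, best-mode selection, negative log-likelihood of the best-mode confidence) may be sketched as follows. The function names, the use of the Euclidean distance, the assumption that confidences are already normalized probabilities, and the small clamp that guards the logarithm are assumptions of this sketch.

```python
import math

def best_mode(pred_modes, truth):
    """Return (index of the closest mode, list of average distances per mode).

    pred_modes: K predicted trajectories, each a list of T (x, y) points
    truth:      the real trajectory, a list of T (x, y) points
    """
    avg = [sum(math.dist(p, g) for p, g in zip(traj, truth)) / len(truth)
           for traj in pred_modes]
    return min(range(len(avg)), key=avg.__getitem__), avg

def cls_loss(all_preds, all_truths, all_conf):
    """Negative log of the best-mode confidence, averaged over vehicles.

    all_conf[i][k] is the confidence of mode k for vehicle i
    (assumed here to be a probability in (0, 1]).
    """
    total = 0.0
    for preds, truth, conf in zip(all_preds, all_truths, all_conf):
        k_star, _ = best_mode(preds, truth)
        total += -math.log(max(conf[k_star], 1e-12))  # clamp avoids log(0)
    return total / len(all_preds)
```

The loss shrinks as the generator assigns more confidence to whichever mode lies closest to the real trajectory, which is the ranking behavior the claim describes.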
4. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 1, wherein step 3 comprises:
Step 3.1, training data processing;
Before training, the input map data and historical trajectory data are vectorized, and the true value of the future interaction probability between vehicles is calculated using a rule-based method;
Specifically, in the vectorization of vehicle history data, the coordinates of all vehicles are first converted into a relative coordinate system whose origin is the position of the target vehicle in the last frame of the history period and whose x-axis is its driving direction; then, for each vehicle, the increments of the x and y coordinates of each frame in the relative coordinate system are calculated; finally, time alignment is performed: for a vehicle observed in a given frame the valid bit is set to 1, and for an unobserved frame the coordinates are set to (0, 0) and the valid bit to 0; in the vectorization of map data, the road center line coordinates are likewise converted into the relative coordinate system whose origin is the position of the target vehicle in the last frame of the history period and whose x-axis is its driving direction; the displacement of each center line coordinate point relative to the previous point is then calculated, and a road is represented by vectors of the relative coordinates plus the displacements;
The distance between the last-frame future coordinates of each vehicle and each road center line is calculated, the occupancy of each road by each vehicle is computed, the occupancy matrix is smoothed and then multiplied by its own transpose, the diagonal is set to zero, and the result is normalized to obtain the inter-vehicle interaction probability matrix, which serves as the true value;
Step 3.2, fixing the parameters of G and training the discriminator D;
Step 3.2.1, the generator G predicts the future trajectory; in the generator G with fixed parameters, a self-attention network or a convolutional network first extracts the historical trajectory features and map features of all vehicles, the features are fused by cross-attention or a similar method, and the fused features serve as the condition information y; the condition information y is then concatenated with a random noise vector and input into the prediction network of G, which outputs the predicted future trajectory;
Step 3.2.2, the discriminator takes a fake sample as input; the predicted future trajectory is concatenated with the historical trajectory into a complete trajectory, i.e. the fake sample; the complete trajectory and the condition information y are input into the discriminator D to obtain D's score for the fake sample and, at the same time, D's evaluation matrix of the future interaction relations between vehicles;
Step 3.2.3, the discriminator takes a real sample as input; the real future trajectory x is concatenated with the historical trajectory into a complete trajectory, i.e. the real sample; the complete real trajectory and the condition information y are input into the discriminator D to obtain D's score for the real sample and its scoring matrix of future interaction probabilities;
Step 3.2.4, calculating the loss of the discriminator D; the goal of the discriminator D is to better distinguish real from fake samples, so the target value of a fake sample is set to 0 and that of a real sample to 1, and the sum of the average errors between the scores that D outputs for the real and fake samples and their respective target values is calculated as the loss d_loss of the real/fake scoring module ScoreNet; the average errors between the two interaction matrices pr1 and pr2 and the true value are calculated directly and used as the loss pr_loss of the future interaction evaluation module;
Step 3.2.5, the DWA algorithm adjusts the loss weights of D and the parameters are updated; from the second batch of data onward, the rates of change of d_loss and pr_loss between the current batch and the previous batch are calculated each time, weights are assigned inversely according to the rates of change, and the weighted losses are summed to obtain the loss of D; once the loss of D is obtained, the parameters of D are updated by back propagation;
Step 3.3, fixing the parameters of D and training the generator G;
Step 3.3.1, the generator G predicts the future trajectory; the process is the same as in step 3.2.1;
Step 3.3.2, the discriminator takes a fake sample as input; the process is the same as in step 3.2.2;
Step 3.3.3, calculating the loss of the generator G; the goal of the generator G is opposite to that of the discriminator D, namely to have the fake sample identified as real by the discriminator D, so the target value of the fake sample is set to 1, and the average error between the score that D outputs for the fake sample and this target value is calculated as the adversarial training loss g_loss of the generator G; to avoid mode collapse and maintain the accuracy of the predicted trajectories, the regression loss reg_loss and the classification loss cls_loss between the multi-modal trajectories predicted by the generator G and the real future trajectory are also calculated;
Step 3.3.4, the DWA algorithm adjusts the loss weights of G and the parameters are updated; for the first batch of training data, g_loss, reg_loss and cls_loss are added directly to obtain the loss of G; from the second batch of data onward, the rates of change of g_loss, reg_loss and cls_loss between the current batch and the previous batch are calculated each time, weights are assigned inversely according to the rates of change, and the weighted losses are summed to obtain the loss of G; once the loss of G is obtained, the parameters of G are updated by back propagation;
Step 3.4, training alternately until the losses of G and D converge;
In the DWA algorithm, the loss of each subtask is calculated for each batch of training data, the rate of change of the error is calculated from the losses of each subtask at the preceding back-propagation steps, and weights are assigned inversely according to the rate of change, so that the subtask weights reach a dynamic balance and the subtasks are optimized collaboratively; the weight update formula of the dynamic weight average algorithm is as follows: [equation not reproduced], where the loss ratio of task i is the ratio of its loss at time t-1 to its loss at time t-2 and reflects how fast the loss of that task converges; the loss weight assigned to task i at time t is determined by the exponential of its loss ratio at time t-1 relative to those of the other tasks; I is the number of tasks and T is the temperature coefficient; when T is sufficiently large, the weights of the tasks are equal.
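As a non-limiting illustration of the dynamic weight strategy of step 3, the weight update may be sketched as follows. This sketch uses the commonly published form of dynamic weight averaging (a softmax over loss ratios, scaled by the number of tasks), which agrees with the terms described in the claim but is not necessarily its exact formula; the function name `dwa_weights` and the default temperature are assumptions.

```python
import math

def dwa_weights(prev_losses, prev_prev_losses, temperature=2.0):
    """Dynamic weight averaging over I tasks.

    prev_losses:      per-task losses at step t-1
    prev_prev_losses: per-task losses at step t-2
    Returns one weight per task; the weights sum to the number of tasks.
    """
    # Loss ratio per task: close to 1 (or above) means the loss is converging slowly.
    ratios = [a / b for a, b in zip(prev_losses, prev_prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    total = sum(exps)
    n_tasks = len(ratios)
    return [n_tasks * e / total for e in exps]

# Example: d_loss stalled (1.0 -> 1.0) while pr_loss halved (1.0 -> 0.5),
# so the stalled task receives the larger weight for the next batch.
weights = dwa_weights([1.0, 0.5], [1.0, 1.0])
```

For the first batch no loss history exists, so the losses are simply added, which corresponds to all weights being 1. A large temperature flattens the softmax and drives every weight toward 1.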
5. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 1, wherein step 4 comprises:
Step 4.1, obtaining a floating-point model;
Step 4.1.1, the optimal set of weight parameters is obtained from the generator G that reached convergence through adversarial training in step 3 and is loaded into a prototype network structure isomorphic to the generator G, yielding an original floating-point model with complete logical structure and parameter weights;
Step 4.1.2, structurally modifying the floating-point model: the input layer of the original floating-point model is wrapped with a QuantStub module to construct a quantization interface that converts input features from floating-point numbers to fixed-point numbers, and the output layer of the original floating-point model is wrapped with a DeQuantStub module to construct a dequantization interface that converts output results from fixed-point numbers to floating-point numbers, yielding the modified floating-point model;
Step 4.2, performing quantization calibration on the model;
Step 4.2.1, an Observer module dynamically monitors the activation values and weights of each layer of the modified floating-point model during forward inference; by traversing a calibration data set, the Observer module extracts the numerical bounds of each layer's data flow, which comprise the minimum value, the maximum value and the distribution pattern, and these bounds serve as the statistical basis for calculating the scale factors and zero points;
Step 4.2.2, forward inference is run on the floating-point model multiple times with the calibration data set, the observed data distributions are accumulated, the optimal quantization parameters of each layer are calculated, and the mapping between floating-point data and fixed-point data is established;
Step 4.2.3, the calibrated floating-point model with quantization parameters is verified against preset evaluation metrics, which comprise minADE, minFDE and MR:
Step 4.2.3.1, if the verification result meets the preset requirement, the calibrated floating-point model with quantization parameters is converted, based on the calibration result, into the fixed-point model, which is used for subsequent inference deployment;
Step 4.2.3.2, if the verification result does not meet the preset requirement, the calibration result is used as initialization and step 4.3 is entered to perform quantization-aware training and further improve the quantized inference accuracy;
Step 4.3, quantization-aware training of the model;
Step 4.3.1, fake-quantization nodes are enabled in the training phase; they simulate the numerical behavior of fixed-point quantization and dequantization during forward inference, so that the model accounts for quantization-induced numerical error in the training of step 4.3.2;
Step 4.3.2, during quantization-aware training, the original trajectory prediction objective is kept unchanged and the fake-quantization nodes remain enabled; the numerical deviation produced by fake quantization is fed back as an error that drives adaptive gradient compensation of the model weights, so that the training process stays consistent with the numerical characteristics of fixed-point inference;
Step 4.3.3, after training is finished, the fake-quantization nodes are removed, the quantization parameters and model weights are frozen, and the fixed-point model for inference is finally obtained; the fixed-point model is exported as an hbm file for subsequent deployment on the on-board computing board of the real vehicle.
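As a non-limiting illustration of steps 4.2 and 4.3.1, the relation between the observed numerical bounds, the scale factor and zero point, and a fake-quantization node may be sketched in plain Python as follows. The signed 8-bit asymmetric scheme, the function names, and the min/max-only observer are assumptions of this sketch; the quantization toolchain actually used with QuantStub, DeQuantStub and the hbm export may compute these parameters differently.

```python
def calibrate(values, qmin=-128, qmax=127):
    """Derive (scale, zero_point) from observed minimum and maximum values.

    The range is widened to include 0.0 so that zero is exactly representable.
    """
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero range
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def fake_quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Quantize to an integer grid, clamp, and map back to floating point.

    This mimics what a fake-quantization node does in the forward pass:
    the output stays a float but carries the fixed-point rounding error.
    """
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))
    return (q - zero_point) * scale

# Calibration data observed for one layer (made-up numbers for illustration).
scale, zero_point = calibrate([-1.0, 0.0, 0.4, 2.0])
approx = fake_quantize(0.4, scale, zero_point)
```

Values inside the calibrated range are reproduced to within half a quantization step, while values outside it saturate at the range boundary, which is the error that quantization-aware training lets the weights adapt to.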
6. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 1, wherein step 5 comprises:
Step 5.1, offline verification and consistency check: before the fixed-point model runs on the vehicle, offline verification and a consistency check are first performed on the fixed-point model;
Step 5.2, on-board operation and predicted trajectory visualization: processing is organized as a multi-threaded asynchronous pipeline with two parallel logical links, "state maintenance" and "prediction inference"; specifically, a state maintenance thread (thread A for short) performs data access and dynamic maintenance of the environment state, and a prediction inference thread (thread B for short) drives the trajectory prediction inference task through a periodic trigger mechanism, thereby realizing a complete closed loop from data access to result display.
7. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 5, wherein step 5.2 comprises:
Step 5.2.1, data access and dynamic maintenance of the environment state:
5.2.1.1, data access: using thread A, the vehicle-mounted terminal receives in real time the structured environment information output by the upstream perception module, which comprises the motion state information of the ego vehicle and the motion state information of surrounding vehicles, and continuously writes it into a message buffer of the vehicle-mounted terminal;
5.2.1.2, dynamic maintenance of the environment state: a target maintenance logic module of the vehicle-mounted terminal parses the message buffer data and updates and stores the parsed result in real time in a shared state store to obtain global vehicle information; during parsing, the target identifier serves as the index, and a history window sequence with the corresponding timestamps is kept for each target in a sliding-window manner, forming the global vehicle information structure and realizing dynamic maintenance of the environment state;
Step 5.2.2, periodic triggering and data snapshot generation: at the trigger moment, thread B uses a concurrency control mechanism to read and copy, in a single operation, the data required for the current cycle from the shared state store, generating an independent data snapshot of the global vehicle information for the current cycle; the data snapshot comprises the ego vehicle state and the target information;
Step 5.2.3, feature preprocessing and standardized input construction: thread B builds a normalized tensor from the data snapshot of step 5.2.2 as the input of the fixed-point model in step 5.2.4;
Step 5.2.4, fixed-point model inference: thread B feeds the normalized tensor into the fixed-point model for inference and outputs the raw prediction result, which comprises multi-modal trajectories over the future horizon and the confidence information corresponding to each trajectory;
Step 5.2.5, result post-processing and standardized publishing: through post-processing logic, thread B combines the target state information at the current moment with the output of the fixed-point model to restore the coordinates, and extracts the predicted trajectories according to a confidence-based screening rule; the predicted trajectories are then packaged in a standard communication format and published to the downstream vehicle planning and control module; the published content comprises the predicted trajectory sequence, the confidence and the associated timestamp, so that the downstream planning and control module can consume it in time-aligned decisions;
Step 5.2.6, timing statistics and prediction visualization: through a parallel timing statistics submodule and a trajectory visualization monitoring submodule, thread B monitors the processing time of the full pipeline in real time and renders the prediction results visually, realizing closed-loop operation and interactive display of the fixed-point model at the vehicle-mounted end.
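As a non-limiting illustration of steps 5.2.1 and 5.2.2, the shared state store written by thread A and copied by thread B may be sketched as follows. The class name `SharedStateStore`, the window length, and the choice of a single mutex as the concurrency control mechanism are assumptions of this sketch; the claims do not specify them.

```python
import threading
from collections import deque

class SharedStateStore:
    """Per-target sliding windows of (timestamp, state), guarded by one lock."""

    def __init__(self, window=20):
        self._lock = threading.Lock()
        self._tracks = {}      # target identifier -> deque of (timestamp, state)
        self._window = window

    def update(self, target_id, timestamp, state):
        """Thread A: append the newest observation; the oldest one falls out."""
        with self._lock:
            history = self._tracks.setdefault(target_id, deque(maxlen=self._window))
            history.append((timestamp, state))

    def snapshot(self):
        """Thread B: copy everything needed for this cycle in one locked read.

        States are stored as immutable tuples, so copying the containers
        is enough to make the snapshot independent of later updates.
        """
        with self._lock:
            return {tid: list(history) for tid, history in self._tracks.items()}

if __name__ == "__main__":
    store = SharedStateStore(window=3)

    def writer():  # stands in for thread A receiving perception messages
        for step in range(5):
            store.update(target_id=1, timestamp=0.1 * step, state=(float(step), 0.0))

    thread_a = threading.Thread(target=writer)
    thread_a.start()
    thread_a.join()
    print(store.snapshot())  # thread B would take this at each trigger moment
```

Holding the lock only for the copy keeps thread A's writes short, and the inference in thread B then works on data that cannot change mid-cycle.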
8. The vehicle trajectory prediction and monitoring method based on discriminator-guided implicit future interaction learning according to claim 7, wherein the feature preprocessing and standardized input construction of step 5.2.3 specifically comprises:
Step 5.2.3.1, state alignment and reference frame conversion: for each target in the snapshot, the motion states in the history window are retrieved in chronological order, and state alignment is performed using a spatial transformation operator; the alignment uniformly converts the coordinate positions and heading angles of each target at the different historical sampling moments into a preset reference coordinate system whose origin is the vehicle position at the current moment;
Step 5.2.3.2, feature construction and dimension normalization: feature engineering logic maps and computes on the motion state of each target aligned in step 5.2.3.1, specifically resolving the raw physical quantities of historical displacement, speed and heading angle to generate feature vectors that capture spatio-temporal evolution patterns for the subsequent fixed-point model; meanwhile, dimension normalization is performed for targets with inconsistent numbers of history frames, and missing frames are zero-filled so that the feature vectors of all targets have the same length along the time axis;
Step 5.2.3.3, normalized tensor construction: using a pre-allocated contiguous memory buffer, the feature vectors produced in step 5.2.3.2 are arranged and packed along the "target-time-feature" dimensions; by mapping the discrete features into a contiguous memory space, the normalized tensor required by the input layer of the fixed-point model is constructed directly.
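As a non-limiting illustration of steps 5.2.3.1 to 5.2.3.3, the reference frame conversion and zero-filled "target-time-feature" layout may be sketched as follows. The function names, the three-element feature (x, y, heading), the rotation of the axes by the reference heading, and the placement of zero padding before the oldest available frame are assumptions of this sketch; a pre-allocated contiguous buffer is replaced here by nested lists for readability.

```python
import math

def to_reference_frame(x, y, heading, ref_x, ref_y, ref_heading):
    """Express a pose in a frame centred on the reference vehicle.

    The frame is translated to (ref_x, ref_y) and rotated by -ref_heading,
    so the reference vehicle's driving direction becomes the x-axis.
    """
    dx, dy = x - ref_x, y - ref_y
    c, s = math.cos(-ref_heading), math.sin(-ref_heading)
    return (c * dx - s * dy, s * dx + c * dy, heading - ref_heading)

def build_tensor(histories, ref_pose, window):
    """Return a nested list shaped [target][time][feature].

    histories: per target, a chronological list of (x, y, heading) states
    ref_pose:  (ref_x, ref_y, ref_heading) at the current moment
    window:    number of time steps every target is padded or cut to
    """
    tensor = []
    for history in histories:
        rows = [list(to_reference_frame(*state, *ref_pose))
                for state in history[-window:]]
        # Zero-fill missing (older) frames so all targets share one time length.
        padding = [[0.0, 0.0, 0.0] for _ in range(window - len(rows))]
        tensor.append(padding + rows)
    return tensor
```

For example, with a reference pose of (1, 1, pi/2), a target at (1, 2) with the same heading lies one unit straight ahead and maps to approximately (1, 0, 0).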
9. The vehicle track prediction and monitoring method based on implicit future interactive learning guidance of a discriminator according to claim 8, wherein Step 5.2.6, time-consumption statistics and prediction result visualization, specifically comprises: first, the time-consumption statistics submodule monitors the performance of the key nodes of each inference cycle, wherein the key nodes cover the whole process executed in thread B, namely Step 5.2.2 data snapshot generation, Step 5.2.3 feature preprocessing and standardized input construction, Step 5.2.4 fixed-point model inference, and Step 5.2.5 result post-processing and standard publication; second, the track visualization monitoring submodule receives in real time the predicted track data published in Step 5.2.5 and renders it dynamically on a vehicle display terminal or a remote monitoring system.
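The time-consumption statistics of Step 5.2.6 can be sketched with a small per-stage timer. The following Python example is a hedged illustration only: the class `StageTimer`, the helper `run_cycle` and the stage names are hypothetical, and the stages are passed in as plain callables standing in for Steps 5.2.2 to 5.2.5.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class StageTimer:
    """Collects wall-clock durations (in milliseconds) per named stage."""

    def __init__(self):
        self.samples = defaultdict(list)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Recorded even if the stage raises, so failures stay visible.
            self.samples[name].append((time.perf_counter() - start) * 1000.0)

    def summary(self):
        """Return count, mean and maximum duration for every stage seen so far."""
        return {
            name: {"count": len(v), "mean_ms": sum(v) / len(v), "max_ms": max(v)}
            for name, v in self.samples.items()
        }

def run_cycle(timer, stages):
    """Run one inference cycle.

    `stages` is an ordered list of (name, callable) pairs; each callable
    receives the previous stage's output (None for the first stage).
    """
    data = None
    with timer.stage("cycle_total"):
        for name, fn in stages:
            with timer.stage(name):
                data = fn(data)
    return data
```

A usage example would pass four callables named after the snapshot, preprocessing, inference and post-processing steps, then read `timer.summary()` periodically to check that the mean and maximum of `cycle_total` stay within the fixed inference period.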

Description

Vehicle track prediction and monitoring method based on implicit future interactive learning guidance of discriminator

Technical Field

The invention relates to the field of automatic driving track prediction, in particular to a vehicle track prediction and monitoring method based on implicit future interactive learning guidance of a discriminator.

Background

In a real-vehicle automatic driving system, the track prediction module usually runs online as a functional unit at the vehicle-mounted end. With upstream perception data arriving continuously, it must stably complete feature construction, model inference and result output within a preset period, and provide the prediction result to the downstream planning and control module in a usable form. The track prediction module depends on high-quality track prediction models and algorithms to predict the future tracks of interacting vehicles accurately. Track prediction for automatic driving must consider not only the historical interaction relationship but also ensure that the vehicle interaction within the prediction horizon is reasonable. That is, the model must account for the future interaction of vehicles, including behaviors such as negotiating at vehicle encounters and yielding, so that predicted tracks do not cross, collide or overlap and the predicted multi-vehicle tracks are more socially compliant. According to how future interaction is modeled and how tracks are output, future interaction methods fall into three categories: iterative-feedback methods based on track correction, methods based on synchronous prediction, and conditional prediction methods based on dependency judgment.
Models based on track correction generally first predict the future tracks or intentions of the agents separately and then feed these initial predictions into a correction module, which explicitly considers the influence of the potential future actions of other agents on each agent and corrects each agent's track accordingly. Although such methods model the future interaction game through an iterative feedback mechanism and are interpretable, their real-time performance is poor because tracks are updated through multiple forward passes at inference time; they are difficult to run stably at a fixed period on the vehicle-mounted end, and are also ill-suited to continuous online verification in a real-vehicle system. Future interaction methods based on synchronous prediction predict the joint distribution of the tracks of all agents in a single pass, either by directly modeling the joint probability distribution of future states or by extracting features of future interaction through an attention mechanism. These methods help improve the consistency and plausibility of the prediction results with respect to physical and social norms, but they have three drawbacks. First, the computational complexity is high: the dimension of the joint distribution grows exponentially with the number of agents, so training and inference costs rise rapidly. Second, training depends on a large amount of high-quality interaction scene data, and generalization to unobserved interaction patterns is limited. Finally, compared with track correction methods, black-box synchronous prediction is less interpretable, can hardly provide an explicit explanation of the interaction strategy, and is unfavorable for debugging and verification at the vehicle-mounted end.
Conditional prediction methods based on dependency relationships explicitly define the relationship between influencers and respondents by constructing a marginal-conditional framework. They convert the interaction logic into a traceable structural relationship, overcome the limitation of implicit modeling, and make the model's decision process more consistent with human cognitive logic about traffic scenes. In terms of computational complexity, on the one hand, conditional prediction decomposes the combinatorial explosion of joint prediction: a hierarchical prediction strategy splits the problem into a chain of sub-problems, reducing the complexity to a linear or polynomial level and effectively improving scalability to large-scale scenes. On the other hand, conditional prediction requires an explicit relation prediction module and hierarchical prediction, which is computationally expensive and unsuitable for scenes with strict real-time requirements. During training, obtaining relation labels depends on manual annotation, which is costly on large-scale datasets. In addition, chain p