CN-121999614-A - Traffic track prediction system and method based on global space-time feature map

Abstract

The invention discloses a traffic track prediction system and method based on a global space-time feature map. The system comprises a global space-time fusion layer, a space-time graph convolution layer and a track prediction layer. The global space-time fusion layer calculates an interaction adjacency matrix according to input historical track information and outputs it to the space-time graph convolution layer. The space-time graph convolution layer comprises a plurality of space-time graph convolution modules connected in sequence, each comprising a multi-scale time convolution module and a dual-channel spatial convolution module; after the historical track information and the interaction adjacency matrix are processed by these modules, a space-time feature representation is output to the track prediction layer. The track prediction layer performs prediction through a pre-trained track prediction model according to the space-time feature representation and outputs predicted track information. The method and the system can improve the efficiency and accuracy of traffic track prediction in complex traffic scenes.

Inventors

WANG YIFAN
Ou Dongyuan
LUO WEINAN

Assignees

Guangdong University of Technology (广东工业大学)

Dates

Publication Date: 20260508
Application Date: 20260211

Claims (10)

1. A traffic track prediction system based on a global space-time feature map, characterized by comprising a global space-time fusion layer, a space-time graph convolution layer and a track prediction layer; the global space-time fusion layer is used for calculating an interaction adjacency matrix according to input historical track information and outputting it to the space-time graph convolution layer, wherein the historical track information is the historical tracks of all traffic participants in a scene to be predicted; the space-time graph convolution layer comprises a plurality of space-time graph convolution modules connected in sequence, and after the historical track information and the interaction adjacency matrix are processed by the plurality of space-time graph convolution modules connected in sequence, a space-time feature representation is output to the track prediction layer; and the track prediction layer is used for performing prediction through a pre-trained track prediction model according to the space-time feature representation and outputting predicted track information.
2. The traffic track prediction system based on a global space-time feature map of claim 1, wherein the multi-scale time convolution module comprises a plurality of convolution kernels of different sizes and a first splicing layer, the outputs of the plurality of convolution kernels being connected to the input of the first splicing layer.
3. The traffic track prediction system based on a global space-time feature map of claim 2, wherein the plurality of convolution kernels comprises at least a small-size convolution kernel having a size smaller than a preset kernel size threshold and a large-size convolution kernel having a size larger than the preset kernel size threshold.
4. The traffic track prediction system based on a global space-time feature map of claim 1, wherein the dual-channel spatial convolution module comprises a first spatial channel, a second spatial channel, and a second splicing layer; the first spatial channel comprises a first input channel, a plurality of information propagation layers and an information aggregation layer, wherein the output of the first input channel is connected to the input of each of the information propagation layers, and the outputs of the information propagation layers are connected to the input of the information aggregation layer; the second spatial channel comprises a second input channel, a plurality of information propagation layers and an information aggregation layer, wherein the output of the second input channel is connected to the input of each of the information propagation layers, and the outputs of the information propagation layers are connected to the input of the information aggregation layer; and the output of the first spatial channel and the output of the second spatial channel are connected to the input of the second splicing layer.
5. A traffic track prediction method based on the traffic track prediction system according to any one of claims 1 to 4, characterized by comprising the steps of: collecting historical track information of all traffic participants in a scene to be predicted, and generating an interaction adjacency matrix through the global space-time fusion layer according to the historical track information; inputting the historical track information and the interaction adjacency matrix into the space-time graph convolution layer, and outputting the space-time feature representation; and performing prediction through the track prediction layer according to the space-time feature representation to obtain predicted track information.
6. The traffic track prediction method according to claim 5, wherein the step of collecting historical track information of all traffic participants in a scene to be predicted and generating an interaction adjacency matrix through the global space-time fusion layer according to the historical track information specifically comprises: collecting position information of all traffic participants at a plurality of past time steps to obtain the historical track information; flattening the historical track information to obtain a space-time feature sequence matrix; calculating a global space-time feature matrix from the space-time feature sequence matrix through a multi-head attention mechanism in the global space-time fusion layer; extracting the space-time feature matrix corresponding to the last time step from the global space-time feature matrix to obtain a prior feature matrix; and calculating the interaction adjacency matrix from the prior feature matrix through a self-attention mechanism in the global space-time fusion layer.
7. The traffic track prediction method according to claim 5, wherein inputting the historical track information and the interaction adjacency matrix into the space-time graph convolution layer and outputting the space-time feature representation specifically comprises: inputting the historical track information into the multi-scale time convolution module of the first space-time graph convolution module for dilated convolution processing, and splicing the dilated convolution results to obtain a time feature representation; inputting the time feature representation and the interaction adjacency matrix into the dual-channel spatial convolution module of the first space-time graph convolution module, and calculating a spatial feature representation; and using the spatial feature representation output by the (a-1)-th space-time graph convolution module as the input data of the a-th space-time graph convolution module, and, after processing by the plurality of space-time graph convolution modules in sequence, outputting the spatial feature representation of the last space-time graph convolution module as the space-time feature representation.
8. The traffic track prediction method according to claim 7, further comprising, after splicing the dilated convolution results: processing the splicing result through a tanh function and a sigmoid function respectively, and splicing the processing results of the tanh function and the sigmoid function to obtain the time feature representation.
9. The traffic track prediction method according to claim 7, wherein the calculation process of the spatial feature representation includes: according to the interaction adjacency matrix and the time feature representation, respectively performing spatial dependency calculation through the plurality of information propagation layers in the dual-channel spatial convolution module, and correspondingly outputting a plurality of forward propagation results; aggregating all the forward propagation results to obtain a forward spatial feature representation; according to the transpose of the interaction adjacency matrix and the time feature representation, respectively performing spatial dependency calculation through the plurality of information propagation layers, and correspondingly outputting a plurality of reverse propagation results; aggregating all the reverse propagation results to obtain a reverse spatial feature representation; and splicing the forward spatial feature representation and the reverse spatial feature representation to obtain the spatial feature representation.
10. The traffic track prediction method according to claim 5, wherein the pre-trained track prediction model is a GRU-based sequence-to-sequence prediction model comprising an encoder and a decoder, and the process of predicting the predicted track information comprises: performing time-series modeling on the space-time feature representation through the encoder to obtain a hidden state vector; and decoding, by the decoder, according to the traffic participant position information of the last time step and the hidden state vector to obtain the predicted track information.
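
As an illustration of the last step of claim 6 and of claim 9, the following is a minimal pure-Python sketch, not the patented implementation. It assumes a single attention head with hypothetical projection matrices w_q and w_k, assumes that the k-th information propagation layer applies the k-th power of the adjacency matrix to the features, and assumes an element-wise mean as the aggregation rule; the claims do not specify any of these choices. All function names are illustrative.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def interaction_adjacency(prior, w_q, w_k):
    """Scaled dot-product self-attention weights between agents.

    prior: one feature row per traffic participant (the prior feature matrix).
    w_q, w_k: hypothetical query/key projections (assumed, single head).
    """
    q = matmul(prior, w_q)
    k = matmul(prior, w_k)
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    return [softmax([s / scale for s in row]) for row in scores]

def propagate(adj, feats, hops):
    """Return [A X, A^2 X, ..., A^hops X], one result per propagation layer (assumed form)."""
    results, current = [], feats
    for _ in range(hops):
        current = matmul(adj, current)
        results.append(current)
    return results

def aggregate(results):
    """Element-wise mean over the propagation results (assumed aggregation rule)."""
    n = len(results)
    rows, cols = len(results[0]), len(results[0][0])
    return [[sum(r[i][j] for r in results) / n for j in range(cols)] for i in range(rows)]

def dual_channel_spatial(adj, feats, hops=2):
    """Forward channel uses A, reverse channel uses A transposed; outputs are spliced per agent."""
    forward = aggregate(propagate(adj, feats, hops))
    reverse = aggregate(propagate(transpose(adj), feats, hops))
    return [f + r for f, r in zip(forward, reverse)]

# Toy run: three traffic participants with two features each.
prior = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
adj = interaction_adjacency(prior, identity, identity)
spatial = dual_channel_spatial(adj, prior, hops=2)
```

In this toy run each row of adj sums to 1 because of the softmax, and each row of spatial is twice as wide as the input features because the forward and reverse halves are spliced together.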

Description

Traffic track prediction system and method based on global space-time feature map

Technical Field

The invention relates to the field of track prediction, in particular to a traffic track prediction system and method based on a global space-time feature map.

Background

Track prediction is one of the key core technologies in automatic driving and intelligent traffic systems, and its prediction accuracy and real-time performance directly influence the safety, reliability and traffic efficiency of an automatic driving vehicle in a complex traffic environment. In an automatic driving scenario, a vehicle needs to continuously predict the future motion states of surrounding vehicles, pedestrians and other traffic participants in order to complete tasks such as behavior decision, path planning and active safety control in a dynamic traffic environment. To drive safely and effectively avoid potential collision risks, an automatic driving vehicle needs not only to accurately sense its own motion state but also to comprehensively understand the surrounding traffic environment, and to model and infer the future motion tracks and behavior patterns of traffic participants with high precision based on historical track data. This predictive capability is the basic support for the automatic driving system to implement global path planning, local motion control and active safety protection mechanisms. However, current automatic driving track prediction techniques still face a number of challenges.
On the one hand, a large number of traffic participants exist in a traffic scene and their motion states change continuously over time, so the track prediction model needs to process large-scale time-series data while ensuring real-time performance, and the calculation and storage costs are high. On the other hand, the traffic participants have complex space-time dependency relations: the motion behavior of each traffic participant is influenced by its own historical motion trend and constrained by the interaction behaviors of surrounding traffic participants, and traditional methods based on independent target modeling have difficulty effectively describing the interaction coupling among multiple targets. In addition, in an actual traffic scene, the road structure, the traffic density and the driving behavior are highly uncertain, which places higher demands on the generalization capability and prediction stability of the track prediction model in complex scenes. Therefore, how to efficiently model a plurality of traffic participants in a complex traffic scene and fully mine their inherent space-time interaction relations is a key technical problem to be solved in the field of automatic driving track prediction.

Disclosure of Invention

The invention provides a traffic track prediction system and a traffic track prediction method based on a global space-time feature map, which are used for overcoming the defects of the prior art and improving the efficiency and accuracy of traffic track prediction for a plurality of traffic participants in a complex traffic scene.
An embodiment of the invention provides a traffic track prediction system based on a global space-time feature map, which comprises a global space-time fusion layer, a space-time graph convolution layer and a track prediction layer; the global space-time fusion layer is used for calculating an interaction adjacency matrix according to input historical track information and outputting it to the space-time graph convolution layer, wherein the historical track information is the historical tracks of all traffic participants in a scene to be predicted; the space-time graph convolution layer comprises a plurality of space-time graph convolution modules connected in sequence, and after the historical track information and the interaction adjacency matrix are processed by the plurality of space-time graph convolution modules connected in sequence, a space-time feature representation is output to the track prediction layer; and the track prediction layer is used for performing prediction through a pre-trained track prediction model according to the space-time feature representation and outputting predicted track information. Further, the multi-scale time convolution module comprises a plurality of convolution kernels of different sizes and a first splicing layer, and the outputs of the convolution kernels are connected to the input of the first splicing layer. Preferably, the plurality of convolution kernels includes at least a small-size convolution kernel having a size smaller than a preset kernel size threshold and a large-size convolution kernel having a size larger than the preset kernel size threshold. Further, the dual-channel spatial convolution module compri