CN-121982899-A - Method and equipment for predicting traffic flow by pre-training adaptive space-time coding mask
Abstract
The invention discloses a traffic flow prediction method and device based on adaptive spatio-temporal encoding and mask pre-training. The prediction stage uses multi-scale trend-aware temporal attention and structure-aware spatial attention to capture traffic flow trends and road network structural features, respectively. Both the pre-training stage and the prediction stage introduce adaptive spatio-temporal encoding to enhance spatio-temporal feature extraction. The method learns long-term dependencies effectively under low memory overhead, combines trend awareness with structure awareness, markedly improves traffic flow prediction accuracy, and supports the accurate operation of intelligent transportation systems.
Inventors
- Ruan Yiheng
- WANG LEZHANG
- LV XUELEI
- LUO LIN
- Ruan Mengxiong
- ZHANG LIJIE
- DENG XIANJUN
Assignees
- 湖北楚天高速数字科技有限公司
Dates
- Publication Date
- 20260505
- Application Date
- 20260407
Claims (10)
- 1. A traffic flow prediction method enhanced by adaptive spatio-temporal encoding and mask pre-training, characterized by comprising the following steps: S1, randomly masking, in the time dimension and in a block-wise manner, historical traffic flow data covering no less than one day; S2, inputting the masked data into a masked autoencoder for training, so that, with memory usage reduced by the block-wise processing and the time-dimension mask, a pre-trained temporal encoder dedicated to extracting long-term temporal dependencies is obtained; S3, adding time, date, road-network spatial, and adaptive spatio-temporal encodings to short-term traffic flow data to form encoded data; S4, inputting the encoded data into a customized downstream predictor for processing, wherein the temporal trend features are input into a structure-aware spatial attention module, and road network topology information is fused into the attention calculation in the form of a learnable bias matrix to extract spatial structural features as the output features of the predictor; S5, using the pre-trained temporal encoder to extract a temporal representation of the block-processed long-term historical traffic flow data corresponding to the short-term traffic flow data, and fusing the temporal representation with the features output in step S4 to generate an enhanced representation; S6, predicting future traffic flow from the enhanced representation.
- 2. The method of claim 1, wherein the block processing of the historical traffic flow data in step S1 is performed by sliding a one-dimensional convolution kernel over the time dimension without overlap, so as to divide the long sequence into a sequence of equal-length data blocks.
- 3. The method of claim 1, wherein in step S3 the adaptive spatio-temporal encoding is a learnable tensor that is initialized by the Xavier method, participates in gradient updates during training, and adaptively fuses the spatio-temporal context information of traffic flow.
- 4. The method of claim 1, wherein the multi-scale trend-aware temporal attention module comprises at least two attention layers, wherein a first layer generates queries and keys using one-dimensional convolution kernels of size 1 and a second layer generates queries and keys using one-dimensional convolution kernels of size greater than 1.
- 5. The method of claim 4, wherein the one-dimensional convolution kernel of size greater than 1 has a size of 3 or 5.
- 6. The method of claim 1, wherein, in the structure-aware spatial attention module, the road network topology information is a matrix of shortest-path distances between nodes computed from the road network graph structure.
- 7. The method according to claim 1, wherein the fusing in step S5 is performed as follows: the representation of the last block output by the pre-trained temporal encoder after encoding the long-term historical data is extracted, projected through a linear layer, and then added element-wise to the features output in step S4.
- 8. The method of claim 1, wherein both the pre-training of the masked autoencoder in step S2 and the overall training of the prediction stage are optimized using a Huber loss function.
- 9. An electronic device comprising one or more processors and a memory for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-8.
- 10. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1 to 8.
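The block-wise processing and time-dimension masking of step S1 and claim 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the 5-minute sampling interval, the block length of 12, the mask ratio of 0.75, and all function names are assumptions made for the example. A non-overlapping convolution is modelled as a window whose stride equals its kernel size, and only the visible blocks are passed on to the encoder, which is where the memory saving would come from.

```python
import random


def split_into_blocks(series, block_len):
    """Split a long sequence into equal-length, non-overlapping blocks.

    This is the indexing pattern of a 1-D convolution whose stride equals
    its kernel size, so no two windows share a time step.
    """
    if block_len <= 0 or len(series) % block_len != 0:
        raise ValueError("series length must be a positive multiple of block_len")
    return [series[i:i + block_len] for i in range(0, len(series), block_len)]


def embed_blocks(blocks, kernel, bias=0.0):
    """Apply one shared kernel to each block (a single conv output channel)."""
    return [sum(w * x for w, x in zip(kernel, block)) + bias for block in blocks]


def random_block_mask(num_blocks, mask_ratio, rng):
    """Pick which blocks to hide; return (masked, visible) index lists."""
    num_masked = int(num_blocks * mask_ratio)
    masked = sorted(rng.sample(range(num_blocks), num_masked))
    hidden = set(masked)
    visible = [i for i in range(num_blocks) if i not in hidden]
    return masked, visible


# Hypothetical setup: one day sampled every 5 minutes gives 288 steps.
series = [float(t) for t in range(288)]
blocks = split_into_blocks(series, block_len=12)  # 24 blocks of 12 steps each
masked, visible = random_block_mask(len(blocks), 0.75, random.Random(0))
encoder_input = [blocks[i] for i in visible]      # only visible blocks are encoded
```

In a trained model the kernel passed to `embed_blocks` would be learned and would have many output channels; a single fixed channel is used here only to keep the sketch dependency-free.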
Description
Method and equipment for predicting traffic flow by pre-training adaptive space-time coding mask
Technical Field
The invention belongs to the field of intelligent transportation, and in particular relates to a traffic flow prediction method and device enhanced by adaptive spatio-temporal encoding and mask pre-training.
Background
Intelligent transportation systems play a vital role in modern smart cities, undertaking the important tasks of predicting, planning, and managing urban traffic. As a core technology of such systems, traffic flow prediction aims to predict future flow from historical observations. Accurate traffic flow prediction can guide route planning and alleviate congestion, and has therefore been widely studied. Nonetheless, current research still has the following two limitations.
The first limitation is that it is difficult to fully model traffic flow with long periods. Traffic flow varies periodically with people's daily routines, and a complete period is as long as one week. Most existing traffic flow prediction models are trained end to end, and their input length is often limited to a small value (typically 1 hour, i.e., 12 steps). Compared with a traffic period of up to one week, existing prediction methods capture only scattered time slices and may fall into spatio-temporal illusions caused by complex spatio-temporal heterogeneity. A temporal illusion means that different input windows of the same sensor have similar flow sequences but different flow sequences in the corresponding prediction windows. A spatial illusion means that two sensors at different locations have similar flow sequences in the same input window but different flow sequences in the same prediction window.
The second limitation is that the performance gains from architectural improvements are steadily diminishing.
Deep learning models represented by spatio-temporal graph neural networks and Transformers have been widely studied to better capture the spatio-temporal dependencies of traffic flow. Researchers have designed complex graph convolutional networks, attention mechanisms, and other methods for traffic flow prediction. However, on the one hand, as modules are stacked, model structures become increasingly complex. On the other hand, the performance gains from architectural improvements are gradually weakening, showing diminishing marginal returns. Therefore, to improve performance further, the research focus needs to shift from designing new architectures to designing more effective representation methods.
Disclosure of Invention
In view of the defects of the prior art and the demand for improvement, the invention provides a traffic flow prediction method and device enhanced by adaptive spatio-temporal encoding and mask pre-training, which aim to make full use of long-term historical traffic flow data, accurately predict future traffic flow, and support intelligent traffic scheduling and safety management.
The traffic prediction method consists of an upstream mask pre-trained autoencoder and a downstream spatio-temporal predictor. The mask pre-trained autoencoder consists of an encoder and a decoder, both based on the Transformer model. Partially masked long-term traffic flow is fed into the encoder, and the masked portion is reconstructed by the decoder, which provides the spatio-temporal predictor with a contextual representation that includes long-term dependencies. The spatio-temporal predictor consists of a temporal Transformer module and a spatial Transformer module. In the temporal Transformer module, a multi-scale trend-aware self-attention mechanism is designed to capture trend changes in traffic flow.
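The multi-scale trend-aware self-attention described above, in which queries and keys come from one-dimensional convolutions of different kernel sizes, can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the patented module: it uses a single feature channel, fixed hand-picked kernel weights in place of learned ones, zero padding at the sequence ends, and hypothetical function names. With one channel, the usual scaling by the square root of the feature dimension equals 1 and is omitted.

```python
import math


def conv1d_same(x, kernel):
    """1-D convolution with zero padding so the output keeps the input length."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel size must be odd for symmetric padding")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(kernel[j] * padded[t + j] for j in range(len(kernel)))
        for t in range(len(x))
    ]


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def trend_aware_attention(x, q_kernel, k_kernel):
    """Self-attention whose queries and keys are convolutions of the input.

    A kernel of size 1 compares single time steps; a larger kernel lets each
    query and key summarise the local trend around its time step.
    """
    queries = conv1d_same(x, q_kernel)
    keys = conv1d_same(x, k_kernel)
    output = []
    for q in queries:
        weights = softmax([q * k for k in keys])
        output.append(sum(w * v for w, v in zip(weights, x)))  # values are x itself
    return output


def multi_scale_attention(x, layers):
    """Stack attention layers, each with its own (query kernel, key kernel) pair."""
    for q_kernel, k_kernel in layers:
        x = trend_aware_attention(x, q_kernel, k_kernel)
    return x


# Hypothetical normalised flow readings for six consecutive time steps.
flows = [0.1, 0.2, 0.4, 0.9, 0.7, 0.3]
layers = [
    ([1.0], [1.0]),                        # first layer: kernel size 1
    ([-0.5, 0.0, 0.5], [-0.5, 0.0, 0.5]),  # second layer: kernel size 3 (local slope)
]
smoothed = multi_scale_attention(flows, layers)
```

The size-3 kernel here is a simple slope detector, chosen only to show how a wider kernel makes the attention weights depend on local trend rather than on individual readings.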
In the spatial Transformer module, the shortest-path features between nodes are integrated into the self-attention mechanism to capture the structural characteristics of the road network. To enhance the representational capacity of the model, both the mask pre-trained autoencoder and the spatio-temporal predictor use adaptive spatio-temporal encoding to learn the inherent spatio-temporal correlations of traffic flow.
To achieve the above object, according to one aspect of the invention, there is provided a traffic flow prediction method enhanced by adaptive spatio-temporal encoding and mask pre-training, comprising a pre-training stage and a prediction stage. The pre-training stage comprises: S1, randomly masking, in the time dimension and in a block-wise manner, historical traffic flow data covering no less than one day; S2, inputting the masked data into a masked autoencoder for training, so that, with GPU memory usage reduced by the block-wise processing and the time-dimension mask, a pre-trained temporal encoder dedicated to extracting long-term temporal dependencies is obtained. The prediction stage comprises: S3, adding time, date, road network space and self-adaptive space-