
CN-121999615-A - Traffic flow prediction method based on large language model

CN121999615A

Abstract

The invention discloses a traffic flow prediction method based on a large language model. The method comprises the steps of: preparing data; constructing a road network graph structure; collecting traffic sensor data; constructing a traffic prediction model; formatting the data into a mixed format combining natural language and structure; extracting spatial features through the LLM; formatting and then reprogramming the time-series data; fusing the spatial features with the reprogrammed time-series data into spatio-temporal features; lightly fine-tuning the language model through LoRA and performing forward propagation; discarding the suffix part and acquiring the output representation; flattening the output representation; and obtaining the prediction result through a linear projection layer. The invention facilitates the use of an LLM for traffic flow prediction while keeping the backbone language model intact, and significantly reduces the complexity and parameter count of model training while improving prediction accuracy.

Inventors

  • Tian Zhao
  • Liu Wei
  • She Wei
  • Song Xuan
  • Zai Guangjun
  • Shao Kaikai
  • Zhou Zheng
  • Yang Yanfang
  • Zhang Beibei
  • Mao Yiqiao
  • Wang Meng
  • Li Yinghao
  • Wang Youwei

Assignees

  • Zhengzhou University (郑州大学)

Dates

Publication Date
2026-05-08
Application Date
2026-03-11

Claims (10)

  1. A traffic flow prediction method based on a large language model, comprising: data preparation, constructing a road network graph structure and collecting traffic sensor data; model construction, constructing a traffic prediction model whose input is the historical time series of the past T time steps together with the adjacency matrix of the graph structure, and whose output is the predicted time series of the future T′ time steps; data formatting, converting the numerical adjacency matrix of the graph structure into a mixed format combining natural language and structure; spatial feature extraction, designing a prompt template, carrying out multi-round dialogue with the LLM, guiding the LLM to perform static topological structure analysis, and performing inter-node dynamic influence analysis in combination with the time-series flow data of the nodes; time-series data formatting, dividing the time series into independent patches and normalizing each independently through reversible instance normalization; reprogramming, dividing the series into a plurality of consecutive overlapping or non-overlapping patches, projecting the patches from the original data space into a high-dimensional semantic space using a learnable linear transformation as the embedder, and reprogramming them with the word embeddings pre-trained in the backbone network; and spatio-temporal feature fusion, taking the extracted spatial features as a prompt prefix, packaging them together with the reprogrammed time-series data and feeding them into the frozen LLM for forward propagation, discarding the suffix part, acquiring the output representation, flattening the output representation, and obtaining the prediction result through a linear projection layer.
  2. The traffic flow prediction method according to claim 1, further comprising LoRA fine-tuning, wherein the LoRA fine-tuning comprises: selecting the dense layers to be fine-tuned in the pre-trained model, and setting each original pre-trained weight matrix W₀ ∈ R^(d×k) to a frozen state; introducing a low-rank structure into each frozen weight matrix to model its update, constructing two trainable matrices B ∈ R^(d×r) and A ∈ R^(r×k) (r being the rank) that parameterize the weight update matrix ΔW = BA, wherein A is initialized with a random Gaussian distribution of mean 0 and standard deviation σ, B is initialized to a zero matrix, and r ≪ min(d, k) is a hyperparameter; at the start of training, ΔW = BA = 0; during fine-tuning, for a layer containing the LoRA adapter, the forward propagation is the superposition of the original output and the low-rank update output, h = W₀x + BAx, wherein x is the input vector of that layer.
  3. The traffic flow prediction method based on a large language model as set forth in claim 1, wherein the graph structure is G = (V, E, A), wherein V is a finite set of N traffic sensor nodes, E is a finite set of edges connecting the nodes, A ∈ R^(N×N) is the adjacency matrix of G, X_t ∈ R^(N×C) represents the feature data of the N nodes at time t, and x_t^i represents the feature data collected by a single sensor i at time t.
  4. The traffic flow prediction method according to claim 3, wherein the model is constructed as f, whose input is the historical series X_(t−T+1:t) and the adjacency matrix A of G, and whose output is X̂_(t+1:t+T′), representing the predicted feature values for the T′ future time steps; the prediction process of the model is expressed as X̂_(t+1:t+T′) = f(X_(t−T+1:t), A).
  5. The traffic flow prediction method based on the large language model of claim 1, wherein the static topology analysis comprises identifying the direct and indirect neighboring nodes of each node based on the formatted adjacency matrix, and constructing a static topological relation description of each node and its neighboring nodes.
  6. The traffic flow prediction method based on the large language model of claim 5, wherein the inter-node dynamic influence analysis comprises combining the time-series traffic data of the nodes to quantitatively analyze the traffic propagation patterns, hysteresis effects and correlations between a target node and its neighboring nodes of each order.
  7. The traffic flow prediction method according to claim 1, wherein the reprogramming further comprises screening the pre-trained word embeddings E ∈ R^(V×D) by linear probing to build a small-scale text prototype set E′ ∈ R^(V′×D), wherein V′ ≪ V and V is the vocabulary size.
  8. The traffic flow prediction method according to claim 7, further comprising computing a multi-head cross-attention layer, wherein for each attention head k a query matrix Q_k = X̂W_k^Q, a key matrix K_k = E′W_k^K and a value matrix V_k = E′W_k^V are defined; the spatio-temporal sequence is reprogrammed in each attention head as Z_k = ATTENTION(Q_k, K_k, V_k) = SOFTMAX(Q_k K_k^T / √d_k) V_k; the Z_k of all heads are aggregated to obtain Z, which is linearly projected to align the hidden dimension with the backbone model, yielding the time-series reprogrammed representation O.
  9. The traffic flow prediction method based on a large language model of claim 1, wherein the prompt prefix further comprises dataset context, task instructions and time-series statistics, wherein the dataset context provides the LLM with basic background information about the input time series, the task instructions serve as key guidance for transforming the patch embeddings for the specific task, and the time-series statistics enrich the representation of the input time series.
  10. The traffic flow prediction method based on a large language model as set forth in claim 2, wherein the weight update matrix ΔW comprises a scaling factor α/r, i.e. ΔW = (α/r)BA, wherein α is a constant hyperparameter.
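The LoRA fine-tuning described in claims 2 and 10 can be sketched as follows. This is an illustrative PyTorch sketch, not the patent's implementation: the class name `LoRALinear` and the default values of `r`, `alpha` and `sigma` are assumptions.

```python
# Minimal LoRA adapter sketch (illustrative; names and defaults are assumed).
# The pre-trained weight W0 is frozen; only the low-rank factors A and B are
# trained, and the forward pass is the superposition h = W0 x + (alpha/r) B A x.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in, d_out, r=8, alpha=16.0, sigma=0.02):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)               # freeze W0
        self.A = nn.Parameter(torch.randn(r, d_in) * sigma)  # Gaussian init, mean 0, std sigma
        self.B = nn.Parameter(torch.zeros(d_out, r))         # zero init -> delta W = 0 at start
        self.scale = alpha / r                               # alpha is a constant hyperparameter

    def forward(self, x):
        # Original output plus scaled low-rank update output.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```

Because B is initialized to zero, the adapted layer reproduces the frozen layer exactly at the start of training, and only A and B appear among the trainable parameters.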
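The "data formatting" step of claim 1 (converting the numerical adjacency matrix into a mixed natural-language and structured format) might look roughly like the following. The patent does not publish its template, so the wording and the function name `format_adjacency` are purely illustrative assumptions.

```python
# Illustrative sketch: render an adjacency matrix as per-node natural-language
# lines that an LLM can read. The template wording is assumed, not the patent's.
def format_adjacency(adj, names=None):
    """Describe each node's direct neighbors in natural language."""
    n = len(adj)
    names = names or [f"sensor_{i}" for i in range(n)]
    lines = []
    for i in range(n):
        nbrs = [names[j] for j in range(n) if j != i and adj[i][j] > 0]
        desc = ", ".join(nbrs) if nbrs else "no neighbors"
        lines.append(f"- {names[i]} is directly connected to: {desc}")
    return "\n".join(lines)
```

The resulting text can then be embedded in the prompt template for the static topology analysis of claim 5.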

Description

Traffic flow prediction method based on large language model

Technical Field

The invention belongs to the technical field of traffic flow prediction models, and relates to a traffic flow prediction method based on a large language model.

Background

Against the background of the rapid development of artificial intelligence technology, Intelligent Transportation Systems (ITS) have become a key solution to urban traffic challenges. As a core component of an intelligent transportation system, urban road traffic flow prediction plays an important role in traffic theory research. Traditional time-series models, such as the autoregressive integrated moving average (ARIMA) model and the Kalman filter, have difficulty fully describing the complex nonlinear features and spatio-temporal dependence in traffic data. Typical deep learning methods capture spatial dependence using Convolutional Neural Networks (CNNs) and handle temporal dynamics with Recurrent Neural Networks (RNNs), but owing to the non-Euclidean spatial structure and complex periodic variation patterns of traffic networks, CNNs and RNNs still face significant challenges in efficiently modeling spatio-temporal dependence. Models based on Graph Convolutional Networks (GCNs) often suffer from over-smoothing, which limits the capture of global spatial features. This limitation motivated the academic community to turn to attention-based models, but as prediction scenarios become more complex, existing model structures grow increasingly complicated and the number of parameters and training cost rise sharply, so that the models face serious challenges in scalability and computational efficiency. In addition, most models are still limited to task-specific training and are prone to overfitting, which restricts further improvement of their generalization capability.
In recent years, Large Language Models (LLMs) have made remarkable progress in fields such as Computer Vision (CV) and Natural Language Processing (NLP). Compared with the complex structural design and high training cost of existing traffic prediction models, large language models rely mainly on parameter scaling and large-scale pre-training for capability evolution, while their basic model structure remains relatively stable. However, large language models were originally designed for natural language processing tasks: their token-based embedding mechanism does not directly adapt to the time-series data formats common in the traffic domain, and their pre-training does not naturally cover the understanding and reasoning of time-series patterns. On the other hand, current traffic prediction methods based on large language models focus on time-dimension modeling of the data and often ignore its spatial dependencies. In a real traffic scene, spatial association is of great importance: if the traffic flow of a certain road section suddenly increases or decreases, similar or complementary flow changes often occur on adjacent road sections, and this spatial interaction effect highlights the necessity of integrating both spatial and temporal information in predictive modeling. In traffic prediction, strong correlations exist between spatial variables, and research has proved that the spatial dimension has an important influence on prediction performance; however, the prior art has not well solved how to effectively fuse the temporal and spatial features extracted from traffic flow so as to adapt them to the input mode of a large language model and thereby fully stimulate its reasoning and understanding of the joint spatio-temporal features.
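The reprogramming idea sketched above (embedding time-series patches, then cross-attending against a small set of text prototypes screened from the backbone's word embeddings) can be illustrated roughly as follows, assuming PyTorch. The single-head simplification, the class name `Reprogrammer` and all dimensions are assumptions for illustration, not the patent's exact design.

```python
# Illustrative single-head reprogramming sketch (assumed names and dimensions).
# Patches are embedded into a semantic space, then reprogrammed via cross-
# attention against text prototypes linearly probed from frozen word embeddings.
import math
import torch
import torch.nn as nn

class Reprogrammer(nn.Module):
    def __init__(self, patch_len, d_model, vocab_emb, n_proto=100):
        super().__init__()
        V, D = vocab_emb.shape
        self.patch_embed = nn.Linear(patch_len, d_model)    # learnable patch embedder
        self.proto_map = nn.Linear(V, n_proto, bias=False)  # linear probe: V -> V' prototypes
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(D, d_model)
        self.v = nn.Linear(D, d_model)
        self.register_buffer("vocab_emb", vocab_emb)        # frozen word embeddings

    def forward(self, patches):                  # patches: (batch, n_patch, patch_len)
        x = self.patch_embed(patches)            # (batch, n_patch, d_model)
        proto = self.proto_map(self.vocab_emb.T).T   # (n_proto, D) text prototypes
        Q, K, V_ = self.q(x), self.k(proto), self.v(proto)
        attn = torch.softmax(Q @ K.T / math.sqrt(Q.shape[-1]), dim=-1)
        return attn @ V_                         # reprogrammed representation
```

The output lives in the backbone's hidden space, so it can be concatenated after the spatial-feature prompt prefix and fed into the frozen LLM.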
Disclosure of Invention

In view of the above problems, the invention provides a traffic flow prediction method based on a large language model, which well solves the problems in the prior art. In order to achieve the above purpose, the technical scheme adopted by the invention is as follows. A traffic flow prediction method based on a large language model comprises the following steps: data preparation, constructing a road network graph structure and collecting traffic sensor data; model construction, constructing a traffic prediction model whose input is the historical time series of the past T time steps together with the adjacency matrix of the graph structure, and whose output is the predicted time series of the future T′ time steps; data formatting, converting the numerical adjacency matrix of the graph structure into a mixed format combining natural language and structure; spatial feature extraction, designing a prompt template, carrying out dialogue with the LLM, guiding the LLM to perform static topological structure analysis, and performing inter-node dynamic influence analysis in combination with the time-series flow data of the nodes; time-series