CN-121747330-B - Highway traffic flow prediction method based on structured semantics and expert network


Abstract

The invention relates to a highway traffic flow prediction method based on structured semantics and an expert network. At the edge side of the highway, road condition images are first converted into low-bandwidth structured text, which is uploaded to the cloud. The cloud associates multi-source heterogeneous data according to the structured text, then uses a gating routing network to dynamically select general expert networks according to the text semantics for feature extraction, and uses a global text expert network and a global video image expert network to extract global text and video image features. The features are fused by weighting, and the text semantics and the fused features are input into a prediction decoder to obtain a prediction result. The model is trained with a joint loss, and the trained prediction model is then used to predict highway traffic flow. The invention reduces data transmission bandwidth and achieves adaptive fusion of multi-source heterogeneous information, thereby improving the accuracy and reliability of highway traffic flow prediction in complex scenes.

Inventors

LI LINFENG
LIU BOHAI
SUN YIXI
BAI YUHONG

Assignees

福建省高速公路联网运营有限公司

Dates

Publication Date: 20260512
Application Date: 20260226

Claims (9)

1. A highway traffic flow prediction method based on structured semantics and an expert network, characterized by comprising the following steps:
Step S1: acquiring road condition images at the edge side of a highway, generating a structured text describing road condition information through a pre-trained multimodal model, and uploading the structured text to the cloud;
Step S2: acquiring, at the cloud, multi-source heterogeneous data of the road section associated with the structured text, and converting the multi-source heterogeneous data into semantic vector sequences with a unified feature dimension;
Step S3: based on the semantic vector sequence of the structured text, dynamically selecting, through a gating routing network, the general expert networks required for highway traffic flow prediction, and inputting the required multi-source heterogeneous data into each selected general expert network to obtain a semantic feature sequence of heterogeneous multi-source information;
Step S4: performing weighted fusion of the semantic feature sequence of the multi-source information, the global text feature, and the global video image feature to obtain a fused heterogeneous information feature vector sequence;
Step S5: inputting the structured text semantic vector sequence and the fused heterogeneous information feature vector sequence into a prediction decoder for decoding to obtain a highway traffic flow prediction result, constructing a total loss from the error between the prediction result and the real traffic flow and from the auxiliary loss of the gating routing network, and training the whole model to obtain a trained prediction model;
Step S6: predicting highway traffic flow with the trained prediction model to generate the corresponding prediction result;
wherein step S1 specifically comprises the following steps:
Step S11: edge equipment deployed on the highway, having a high-performance camera and a computing unit, captures one frame of road condition image at intervals; the captured image is processed by a normalization function to obtain a normalized road condition image;
Step S12: the normalized road condition image, together with a preset prompt that guides structured text generation, is input into the pre-trained multimodal model deployed in the computing unit, which generates a structured text describing the road condition information from the image; the structured text is uploaded to the cloud.
2. The highway traffic flow prediction method based on structured semantics and an expert network according to claim 1, wherein step S2 specifically comprises the following steps:
Step S21: at the cloud, a multi-source heterogeneous data sequence of the road section associated with the structured text is acquired by a function that retrieves multi-source heterogeneous data according to road section location information, taking as input the location information of the road section associated with the structured text; the i-th element of the sequence is the i-th multi-source heterogeneous data of that road section, and L is the number of multi-source heterogeneous data items;
Step S22: the acquired multi-source heterogeneous data sequence is converted into a set of semantic vector sequences with a unified feature dimension, the i-th element of the set being the semantic vector sequence of the i-th multi-source heterogeneous data. Three conversion functions are used: one converts text data into a semantic vector sequence, one converts picture data into a semantic vector sequence, and one converts video data into a semantic vector sequence. The operations appearing in these functions are: a word embedding operation; position encoding of the input data; partitioning of picture data into blocks; a convolution operation on the picture blocks; and extraction of key frames from video data. The m-th element of a multi-source heterogeneous data semantic vector sequence is its m-th word vector.
3. The highway traffic flow prediction method based on structured semantics and an expert network according to claim 1, wherein step S3 specifically comprises the following steps:
Step S31: the structured text generated at the edge side of the highway is encoded to obtain a structured text semantic vector sequence, whose n-th element is the n-th word vector of the sequence;
Step S32: the structured text semantic vector sequence is input into the gating routing network, which calculates the indexes and weights of the top-K general expert networks, the weights of the two global-field-of-view expert networks for text and video image feature extraction, and the auxiliary loss of the gating routing network. The outputs are: the index sequence of the top-K general expert networks selected by the gating routing network based on the structured text semantic vector sequence; the weight sequence of the selected top-K general expert networks; the weight of the global text feature extraction expert network; and the weight of the global video image feature extraction expert network. The auxiliary loss of the gating routing network is calculated from the call frequency of the j-th general expert network and the mean routing probability of the j-th general expert network within one training batch, where N is the total number of general expert networks;
Step S33: the corresponding general expert networks are selected according to the index sequence, the index sequence is mapped to a corresponding sequence, and the required multi-source heterogeneous data is input into the top-K general expert networks for feature extraction to obtain the semantic features of the heterogeneous multi-source information. The result is the semantic feature sequence of heterogeneous multi-source information extracted by the top-K general expert networks, whose k-th element is the semantic feature extracted by the k-th general expert network. The input of the k-th general expert network is the set of multi-source heterogeneous data semantic vector sequences that it requires, taken from the set of semantic vector sequences of the multi-source heterogeneous data. All text data used by the general expert networks selected in the current routing is input into the global text feature extraction expert network for feature extraction to obtain the global text feature of the current routing. All video image data used by the general expert networks selected in the current routing is input into the global video image feature extraction expert network for feature extraction to obtain the global video image feature of the current routing.
4. The highway traffic flow prediction method based on structured semantics and an expert network according to claim 3, wherein in step S33 each general expert network comprises L1 attention layers, each layer comprising layer normalization modules, a multi-head self-attention (MSA) module, and a multi-layer perceptron (MLP) module. In the c-th attention layer, the layer input is processed by the first layer normalization module and the multi-head self-attention module to give an intermediate feature, and the intermediate feature is processed by the second layer normalization module and the multi-layer perceptron module to give the output feature of the c-th attention layer. After the data input to the general expert network has been processed by the L1 attention layers in sequence, the final semantic feature of the heterogeneous multi-source information is obtained.
The global text feature extraction expert network contains L2 recursive evolution layers. Each recursive evolution layer is defined over time steps in terms of the following quantities: the input feature at a time step; the implicit memory tensors at the respective time steps; the Hadamard product; an adaptive forgetting gating operator, which calculates the retention ratio of history information under time-varying parameters; an information injection intensity operator, which controls the intensity with which the current input feature is written into the memory tensor; an input projection operator, which reconstructs the semantic feature at the current moment; a learnable time-scale projection weight; a time-scale offset; a predefined structured state parameter; a smooth nonlinear activation function; an input projection weight; an input bias; an output projection weight; and an output bias. The input of the first recursive evolution layer is the input of the expert network, and the output of the L2-th recursive evolution layer is the output of the expert network.
The global video image feature extraction expert network contains L3 spline topological association layers. Each spline topological association layer performs feature extraction by attention aggregation based on B-spline basis functions, in terms of the following quantities: the in-layer input and output feature tensors; the query tensor, key tensor, and value tensor generated via a nonlinear mapping; a further quantity defined on one of these tensors; the number of components of the output feature vector of the nonlinear mapping; the number of components of the normalized input feature vector; the nonlinear mapping function on the connecting edges; the p-th B-spline basis function; the control point coefficients learned during network training, each being the coefficient on the computation path by which the p-th B-spline basis function maps the i-th component of the normalized input feature vector to the j-th component of the output feature vector, where i is the index of the input feature vector component, j is the index of the output feature vector component, and p is the index of the B-spline basis function; the spline grid granularity; a basic weight parameter matrix; an activation function; and layer normalization. Adjusting the control point coefficients allows the layer to approximate the complex nonlinear spatio-temporal evolution in traffic flow video. The input of the first spline topological association layer is the input of the expert network, and the output of the L3-th spline topological association layer is the output of the expert network.
5. The highway traffic flow prediction method based on structured semantics and an expert network according to claim 3, wherein step S4 comprises the following steps:
Step S41: the semantic features of the heterogeneous multi-source information extracted by the top-K general expert networks, the global text feature extracted by the global text feature extraction expert network, and the global video image feature extracted by the global video image feature extraction expert network are fused to obtain a fused feature vector sequence, in which the feature from the k-th general expert network is weighted by the weight of the k-th general expert network;
Step S42: the fused feature vector sequence is passed through a nonlinear transformation by feedforward neural networks to obtain the fused heterogeneous information feature vector sequence S used for traffic flow prediction; the transformation consists of a first feedforward neural network, an activation function, and a second feedforward neural network.
6. The highway traffic flow prediction method based on structured semantics and an expert network according to claim 3, wherein step S5 specifically comprises the following steps:
Step S51: the structured text semantic vector sequence and the fused heterogeneous information feature vector sequence S are input into the prediction decoder to obtain a final prediction vector sequence, whose n-th element is the n-th word vector of the prediction vector sequence. The prediction decoder is a stack of H decoding layers, each comprising a masked multi-head self-attention module, a multi-head cross-attention module, layer normalization modules, and a multi-layer perceptron module. In the h-th decoding layer, the layer input is normalized by the first layer normalization module, and the normalized result is combined with the first query, key, and value projection matrices of the layer to give the first query, key, and value feature matrices. The masked multi-head self-attention module of the layer operates on these matrices, and its output is residually connected with the layer input. The result is normalized by the second layer normalization module and combined with the second query projection matrix to give the second query feature matrix; this stage also takes S, the fused heterogeneous information feature vector sequence for traffic flow prediction, and uses the second key and value projection matrices to give the second key and value feature matrices. The output of the multi-head cross-attention module of the layer is residually connected with the residual-connected output of the masked multi-head self-attention module. That result is processed by the third layer normalization module and the multi-layer perceptron module of the layer to give the output of the h-th decoding layer. The final prediction vector sequence is obtained from the output of the H-th decoding layer through the last layer normalization module and the last multi-layer perceptron module of the prediction decoder, where H is the number of stacked decoding layers;
Step S52: the final prediction vector sequence is passed through a multi-layer perceptron module to obtain the highway traffic flow prediction result Y, whose dimensions are given by the predicted time step t and the number of lanes;
Step S53: the total loss of the whole model is calculated. It includes the mean square error loss of one training batch, computed from the real traffic flow of the i-th sample in the batch and the highway traffic flow predicted for the i-th sample, over the number of samples in the training batch;
Step S54: back propagation is performed based on the total loss of the whole model, the model parameters are updated, and training yields the final prediction model.
7. The highway traffic flow prediction method based on structured semantics and an expert network according to claim 1, wherein step S6 specifically comprises the following steps:
Step S61: multi-source heterogeneous data of highway monitoring points is acquired at regular intervals and stored in a database;
Step S62: real-time road condition images are captured at regular intervals at the edge side of the highway, and a structured text is generated at the edge side and uploaded to the cloud;
Step S63: the latest multi-source heterogeneous data set is obtained from the database, its L-th element being the latest L-th multi-source heterogeneous data of the monitoring point;
Step S64: the multi-source heterogeneous data set and the structured text are input into the trained prediction model to obtain the highway traffic flow prediction result.
8. A computer device comprising at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the method of any of claims 1-7.
9. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1-7.
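The routing, fusion, and loss steps described in claims 3, 5, and 6 can be illustrated with a minimal sketch. The formulas themselves are not legible in this text, so everything below is an assumption rather than the claimed method: the auxiliary loss is written in the common load-balancing form N * sum_j(f_j * P_j), the top-K weights are renormalized to sum to 1, fusion is a plain weighted sum, and the function names and the `aux_coeff` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(logits, k):
    """Pick the k most probable experts for one sample.

    Returns (indices, weights, probs). Renormalizing the selected
    probabilities so the weights sum to 1 is an assumption.
    """
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda j: probs[j], reverse=True)
    indices = ranked[:k]
    mass = sum(probs[j] for j in indices)
    weights = [probs[j] / mass for j in indices]
    return indices, weights, probs


def auxiliary_loss(batch_logits, k):
    """Assumed load-balancing loss: N * sum_j f_j * P_j over one batch.

    f_j is the call frequency of expert j, P_j its mean routing
    probability in the batch, and N the number of experts.
    """
    n = len(batch_logits[0])
    calls = [0] * n
    prob_sums = [0.0] * n
    for logits in batch_logits:
        indices, _, probs = route_top_k(logits, k)
        for j in indices:
            calls[j] += 1
        for j in range(n):
            prob_sums[j] += probs[j]
    batch = len(batch_logits)
    freq = [c / (batch * k) for c in calls]
    mean_prob = [s / batch for s in prob_sums]
    return n * sum(f * p for f, p in zip(freq, mean_prob))


def fuse(features, weights):
    """Weighted sum of equal-length feature vectors (assumed fusion rule)."""
    dim = len(features[0])
    return [sum(w * feat[d] for w, feat in zip(weights, features))
            for d in range(dim)]


def total_loss(predicted, actual, aux, aux_coeff=0.01):
    """Batch mean square error plus a scaled auxiliary loss.

    aux_coeff is a hypothetical balancing factor, not taken from the source.
    """
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return mse + aux_coeff * aux
```

In this form the auxiliary loss equals 1 when the mean routing probabilities are uniform and grows as routing concentrates on a few experts, which is why a term of this kind is typically added to the prediction error during training.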

Description

Highway traffic flow prediction method based on structured semantics and expert network

Technical Field

The invention relates to the technical field of traffic flow prediction and control, and in particular to a highway traffic flow prediction method based on structured semantics and an expert network.

Background

Highway traffic flow prediction is key to intelligent highway traffic control. Accurate prediction helps optimize road network scheduling, relieve congestion, and improve travel efficiency. Traditional prediction schemes rely mainly on two types of data: full video streams returned in real time from roadside monitoring cameras and analyzed in the cloud, and cross-section traffic flow data collected by fixed sensors such as microwave radars and coils.

However, the prior art has notable shortcomings:

1. Full video backhaul places heavy pressure on communication bandwidth and cloud storage. On remote road sections with poor network signal, the real-time performance and stability of data transmission are hard to guarantee, which affects the timeliness of prediction.
2. Traditional sensors provide only numerical information such as traffic volume and speed. They cannot perceive rich visual and environmental semantic information such as "road surface icing", "traffic accident", or "vehicle queue length", so the model's perception of complex traffic conditions is insufficient.
3. Multi-source data such as weather, holidays, POIs (points of interest), and historical traffic differ greatly in format, dimension, and semantics. Existing models struggle to perform effective deep semantic fusion, which limits further improvement of prediction accuracy.
Therefore, a new highway traffic flow prediction method is needed that reduces transmission load, fuses multi-source heterogeneous information, and makes full use of environmental semantics.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art by providing a highway traffic flow prediction method based on structured semantics and an expert network. The method converts high-bandwidth video into low-bandwidth structured text at the edge side and introduces a hybrid expert network in the cloud for dynamic routing and multi-modal feature extraction and fusion, achieving high-precision, adaptive highway traffic flow prediction under low bandwidth requirements.

To achieve this purpose, the invention adopts the following technical scheme. A highway traffic flow prediction method based on structured semantics and an expert network comprises the following steps:

Step S1: acquiring road condition images at the edge side of a highway, generating a structured text describing road condition information through a pre-trained multimodal model, and uploading the structured text to the cloud;
Step S2: acquiring, at the cloud, multi-source heterogeneous data of the road section associated with the structured text, and converting the multi-source heterogeneous data into semantic vector sequences with a unified feature dimension;
Step S3: based on the semantic vector sequence of the structured text, dynamically selecting, through a gating routing network, the general expert networks required for highway traffic flow prediction, and inputting the required multi-source heterogeneous data into each selected general expert network to obtain a semantic feature sequence of heterogeneous multi-source information;
Step S4: performing weighted fusion of the semantic feature sequence of the multi-source information, the global text feature, and the global video image feature to obtain a fused heterogeneous information feature vector sequence;
Step S5: inputting the structured text semantic vector sequence and the fused heterogeneous information feature vector sequence into a prediction decoder for decoding to obtain a highway traffic flow prediction result, constructing a total loss from the error between the prediction result and the real traffic flow and from the auxiliary loss of the gating routing network, and training the whole model to obtain a trained prediction model;
Step S6: predicting highway traffic flow with the trained prediction model to generate the corresponding prediction result.

Further, step S1 specifically comprises the following steps:

Step S11: edge equipment deployed on the highway, having a high-performance camera and a computing unit, captures one frame of road condition image at intervals; the captured image is processed by a normalization function to obtain a normalized road condition image;
Step S12: the normalized road condition image is input into the pre-trained multimodal model deployed in the computing unit to generate a structured text describing the road condition information, and uploading it to the