CN-122020523-A - Pavement maintenance decision method based on text numerical value cross modal fusion
Abstract
The invention discloses a pavement maintenance decision method based on text-numerical cross-modal fusion, comprising the following steps: (1) collecting multi-source data; (2) constructing a task-specific corpus and aligning it with the multi-source data; (3) extracting space-time evolution features from the pavement numerical data; (4) extracting text semantic perception features; (5) knowledge injection and cross-modal gated fusion; and (6) deep time-series decomposition for collaborative multi-index prediction. By constructing a task-specific semantic corpus, a serial hybrid space-time channel and a multi-scale semantic perception channel, and by introducing a dynamic knowledge-injection gating mechanism, the invention achieves accurate collaborative prediction of three key pavement performance indexes (PCI, RDI and RQI) and generates customized maintenance suggestions in combination with road-segment clustering. The method is particularly suited to highway infrastructure management, especially scenarios that require integrating multi-source heterogeneous data, improving pavement performance prediction accuracy, and making maintenance decisions more scientific.
Inventors
- PEI LILI
- HAO JUN
- WANG YANSONG
- LI LE
- ZHU YANGYANG
- GUO WEI
- LI NINGXIN
Assignees
- Chang'an University (长安大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-01-16
Claims (9)
- 1. A pavement maintenance decision method based on text-numerical cross-modal fusion, characterized by comprising the following steps:
  Step 1, multi-source data collection: collecting data from highway management departments, meteorological monitoring institutions, traffic control systems and industry technical documents to form an original multi-source data set;
  Step 2, constructing a task-specific corpus and aligning it with the data:
  (2a) preprocessing the pavement-related data: filling missing values in the numerical data collected in step 1 by interpolation to ensure completeness, and converting the categorical labels in the non-numerical data into numerical features for model training by one-hot encoding, to generate an initial numerical feature set;
  (2b) constructing a task-specific semantic corpus: based on the text data collected in step 1, screening and cleaning with keywords from the road engineering field, and constructing a task-specific semantic corpus containing policies and regulations, news reports and public opinion;
  (2c) space-time alignment: establishing an alignment mechanism based on a space-time sliding window, strictly mapping the release time of each text in the corpus obtained in step (2b) to the closed interval [t-N, t] preceding the time step to be predicted, removing text interference from future time steps, and generating a space-time aligned text set;
  (2d) feature vectorization: training a distributed semantic embedding algorithm on the space-time aligned text set obtained in step (2c) to generate a sentence vector matrix;
  (2e) principal component analysis: reducing the dimensionality of the initial numerical feature set of step (2a) and the sentence vector matrix generated in step (2d) by principal component analysis, and finally outputting a preprocessed pavement numerical feature matrix and a preprocessed text semantic feature matrix;
  Step 3, extracting pavement-related data features:
  (3a) inputting the dimension-reduced pavement numerical feature matrix of step (2e) into a local time-series perception module, and capturing complex temporal distribution features through two rounds of dilated causal convolution;
  (3b) applying a nonlinear transformation to the convolved features of step (3a) through a fully connected layer with ReLU activation, and reducing the risk of overfitting through a Dropout layer;
  (3c) adding the input features of the local time-series perception module in (3a) to the Dropout-processed features of (3b) to generate the output feature vector of the local time-series perception module;
  (3d) inputting the feature vector generated in step (3c) into a global association module, generating a query matrix, a key matrix and a value matrix, concatenating all attention head outputs, and applying a linear transformation through an output weight matrix to obtain multi-head attention features;
  (3e) applying residual connection and layer normalization to the multi-head attention features of step (3d) and the local space-time features of step (3c), and finally outputting, through a feed-forward neural network (FFN) transformation, a global space-time evolution feature sequence of the pavement monitoring data;
  Step 4, text semantic feature extraction:
  (4a) inputting the dimension-reduced text semantic feature matrix of step (2e) into a local time-series perception module;
  (4b) applying local convolutions to the text embedding matrix input in step (4a) using several groups of convolution kernels with different window sizes, so as to extract semantic features;
  (4c) applying a nonlinear transformation to the feature map output by each convolution kernel of (4b) through a ReLU function;
  (4d) introducing an attention mechanism to calculate a feature correlation score;
  (4e) calculating the attention weights through a Softmax function;
  (4f) calculating the attention output from the attention weights of (4e);
  (4g) applying a max-pooling operation to the feature map output of each attention layer, and selecting the maximum value in each feature map as the pooling result;
  (4h) concatenating the feature vectors generated by the different convolution kernels to form the final text semantic feature vector;
  Step 5, cross-modal fusion:
  (5a) concatenating, end to end, the global space-time evolution feature sequence of the pavement monitoring data obtained in step 3 and the text semantic feature vector obtained in step 4 to form an initial cross-modal feature set;
  (5b) applying a linear projection to the initial cross-modal feature set to ensure dimensional consistency;
  (5c) introducing a dynamic gating mechanism that dynamically adjusts the contribution of each modality according to the input features, and calculating a gating coefficient;
  (5d) applying a weighted correction to the text features based on the gating coefficient of step (5c), and injecting the text features into the pavement feature space to generate a knowledge-enhanced fusion feature set;
  Step 6, deep time-series decomposition for collaborative multi-index prediction:
  (6a) inputting the knowledge-enhanced fusion feature set obtained in step 5 into a deep time-series decomposition prediction network, and dynamically decoupling the features, using a cascaded series decomposition unit, into a trend component reflecting the long-term aging law and a seasonal component reflecting short-term environmental fluctuation;
  (6b) introducing a periodic dependency capture mechanism, calculating the delay similarity of the sequence in the time dimension, and aggregating similar subsequences;
  (6c) adaptively reconstructing the trend component and the seasonal component using a generative decoder, and synchronously outputting three key pavement performance indexes for the future time step through a multi-target output layer.
- 2. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein the data collected in step 1 comprise three types of core data: pavement-related numerical data, including pavement performance indexes, climate factors and traffic flow data; pavement-related non-numerical data, including basic pavement information and maintenance history records; and external text knowledge, including maintenance standard texts, expert experience records, policy documents and distress inspection reports.
- 3. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein in step (3a) the dilated causal convolution is defined in terms of a convolution kernel of size k, the historical time-point features corresponding to the current time point t in the input feature sequence, and a dilation factor, the dilation factor enabling the model to cover longer historical time spans and thereby learn more complex temporal features.
- 4. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein step (3b) is defined in terms of the nonlinear activation function of the fully connected layer, the input feature, and the output feature after the residual connection.
- 5. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein in step (3d) the output features of step (3c) are input into the global association module to generate a query matrix, a key matrix and a value matrix; single-head attention is calculated from these matrices; and all attention head outputs are concatenated and linearly transformed by an output weight matrix to obtain the multi-head attention features.
- 6. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein in step (3e) residual connection and layer normalization are applied to the multi-head attention features and the output features of the local time-series perception module of (3c); the result is nonlinearly transformed by a feed-forward neural network (FFN); and, after a second layer normalization, the global feature sequence of the pavement monitoring data is obtained.
- 7. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein: in step (4b), several groups of convolution kernels with different window sizes are applied as local convolutions to the text embedding matrix input in step (4a) to extract semantic features; in step (4c), the feature map output by each convolution kernel of (4b) is nonlinearly transformed by a ReLU function; in step (4d), an attention mechanism is introduced to calculate a feature correlation score from the query and key representations, a weight matrix and a bias vector; in step (4e), the attention weights are calculated by a Softmax function; in step (4f), the attention output is calculated from the attention weights of (4e); in step (4g), a max-pooling operation is applied to the feature map output of each attention layer, and the maximum value P in each feature map, each feature map being the one generated by the corresponding convolution kernel, is selected as the pooled feature; and in step (4h), the feature vectors generated by the different convolution kernels are concatenated to form the final text feature vector.
- 8. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein: in step (5a), the pavement-related features obtained in step 3 and the text features obtained in step 4 are concatenated end to end to form an initial cross-modal feature set; in step (5b), a linear projection with a linear transformation matrix is applied to the initial cross-modal feature set to ensure dimensional consistency; in step (5c), a dynamic gating mechanism is introduced, and the gating coefficient, which dynamically adjusts the contribution of each modality according to the input features, is calculated from a linear transformation matrix, a bias term and a sigmoid function; and in step (5d), cross-modal feature fusion is completed based on the gating coefficient of (5c) using element-wise multiplication, yielding the final fused multi-source feature set.
- 9. The pavement maintenance decision method based on text-numerical cross-modal fusion according to claim 1, wherein: in step (6a), the knowledge-enhanced fusion feature set obtained in step 5 is input into a deep time-series decomposition prediction network, and a cascaded series decomposition unit dynamically decouples the features into a trend component reflecting the long-term aging law and a seasonal component reflecting short-term environmental fluctuation; in step (6b), a periodic dependency capture mechanism is introduced, the delay similarity of the sequence in the time dimension is calculated as the autocorrelation coefficient at a given time delay, L being the sequence length, and similar subsequences are aggregated based on the delay similarity; and in step (6c), a generative decoder adaptively reconstructs the trend component and the seasonal component, and a multi-target output layer synchronously outputs the three key pavement performance indexes for the future time step, the final output being the matrix of predicted pavement performance index values calculated from the reconstructed trend features and reconstructed seasonal features output by the decoder, using the weight matrix and bias term of the fully connected output layer.
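Step (2c) of claim 1 restricts the texts used for a prediction at time step t to those released within the closed interval [t-N, t]. The following is a minimal sketch of the temporal part of that filter only; the function name `align_texts`, the record field `released`, and the choice of days as the unit of N are illustrative assumptions and do not come from the claims, and the spatial part of the alignment is not shown.

```python
from datetime import date, timedelta

def align_texts(texts, t, n_days):
    """Keep only texts released within the closed interval [t - N, t].

    Hypothetical helper: texts released after t are dropped so that no
    information from future time steps reaches the model.
    """
    start = t - timedelta(days=n_days)
    return [txt for txt in texts if start <= txt["released"] <= t]

# Toy corpus; identifiers and dates are made up for illustration.
corpus = [
    {"id": "policy-1", "released": date(2025, 1, 10)},  # older than t - N
    {"id": "news-1", "released": date(2025, 3, 1)},     # inside the window
    {"id": "news-2", "released": date(2025, 3, 20)},    # after t (future text)
]

aligned = align_texts(corpus, t=date(2025, 3, 15), n_days=30)
aligned_ids = [txt["id"] for txt in aligned]
```

In this toy run only `news-1` falls inside the 30-day window ending at the prediction date, so both the stale policy text and the future news item are excluded.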
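Step (3a) and claim 3 describe a dilated causal convolution with a kernel of size k and a dilation factor. A minimal single-channel sketch follows, assuming the commonly used form in which the output at time t is the sum over i of kernel[i] times x[t - dilation * i], with positions before the start of the sequence treated as zero. The function names, the kernel values and the dilation schedule (1, then 2) for the two rounds are illustrative assumptions, not values from the claims.

```python
def dilated_causal_conv(x, kernel, dilation):
    """Single-channel dilated causal convolution (assumed standard form).

    y[t] = sum_i kernel[i] * x[t - dilation * i]; indices before the start
    of the sequence contribute zero, so y[t] never depends on x[t+1:].
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, w in enumerate(kernel):
            j = t - dilation * i
            if j >= 0:
                acc += w * x[j]
        y.append(acc)
    return y

def receptive_field(kernel_size, dilations):
    """Number of past time points one output can see after stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
h1 = dilated_causal_conv(x, [0.5, 0.5], dilation=1)   # first round
h2 = dilated_causal_conv(h1, [0.5, 0.5], dilation=2)  # second round
span = receptive_field(2, [1, 2])
```

Stacking the two rounds lets each output cover four input time points with a kernel of size two, which illustrates how a larger dilation factor extends the covered history without enlarging the kernel.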
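Step 5 and claim 8 describe concatenation, linear projection, a sigmoid gating coefficient and an element-wise weighted injection of the text features into the pavement feature space. The sketch below shows one possible reading, assuming the fused vector is the pavement feature plus the gate multiplied element-wise with the projected text feature. The exact fusion formula is not recoverable from the claim text, so this form, the projection of the text part alone before concatenation, and all names (`gated_fusion`, `w_text`, `w_gate`, `b_gate`) are assumptions made for illustration.

```python
import math

def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(h_num, h_text, w_text, w_gate, b_gate):
    # (5b, assumed reading) project text features to the pavement feature size
    t_proj = matvec(w_text, h_text)
    # (5a) end-to-end concatenation of the two modalities
    joint = h_num + t_proj
    # (5c) one gating coefficient in (0, 1) per feature dimension
    gate = [sigmoid(a + b) for a, b in zip(matvec(w_gate, joint), b_gate)]
    # (5d, assumed form) inject gate-weighted text features into pavement features
    fused = [n + g * t for n, g, t in zip(h_num, gate, t_proj)]
    return fused, gate

# Toy values chosen so the result can be checked by hand.
h_num = [1.0, 2.0]
h_text = [4.0, 0.0, 2.0]
w_text = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]
w_gate = [[0.0] * 4, [0.0] * 4]

half_open, gate = gated_fusion(h_num, h_text, w_text, w_gate, b_gate=[0.0, 0.0])
closed, _ = gated_fusion(h_num, h_text, w_text, w_gate, b_gate=[-50.0, -50.0])
```

With a zero gate pre-activation the text contribution is halved, and with a strongly negative bias the gate closes and the output falls back to the pavement features, which is the behaviour a gate is meant to provide when the text is uninformative.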
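Steps (6a) and (6b) and claim 9 describe splitting a sequence into a trend component and a seasonal component and measuring delay similarity through an autocorrelation coefficient over a sequence of length L. The sketch below assumes a moving-average trend estimate with repeated edge values and a mean-removed autocorrelation normalized by its lag-0 value; both are common choices but are assumptions here, as are the function names, and the aggregation of similar subsequences is not shown.

```python
def moving_average(x, window):
    """Centered moving average with edge values repeated (odd window only)."""
    half = window // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * half
    return [sum(padded[i:i + window]) / window for i in range(len(x))]

def decompose(x, window):
    """Split a series into a trend (moving average) and a seasonal remainder."""
    trend = moving_average(x, window)
    seasonal = [v - tr for v, tr in zip(x, trend)]
    return trend, seasonal

def autocorrelation(x, lag):
    """Mean-removed autocorrelation at a given lag, normalized by lag 0."""
    length = len(x)
    mean = sum(x) / length
    centered = [v - mean for v in x]
    denom = sum(v * v for v in centered)
    num = sum(centered[t] * centered[t - lag] for t in range(lag, length))
    return num / denom

# A linear series: the interior of the trend reproduces the series itself.
x_lin = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
trend, seasonal = decompose(x_lin, window=3)

# A series with period 4: the strongest delay similarity is at lag 4.
periodic = [1.0, 0.0, -1.0, 0.0] * 6
best_lag = max(range(1, 9), key=lambda lag: autocorrelation(periodic, lag))
```

The lag with the highest coefficient identifies the dominant period, which is the quantity a periodic dependency capture mechanism would use to decide which delayed subsequences to aggregate.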
Description
Pavement maintenance decision method based on text numerical value cross modal fusion
Technical Field
The invention relates to the technical fields of artificial intelligence and traffic engineering, and in particular to a pavement maintenance decision method based on text-numerical cross-modal fusion.
Background
As the service life of traffic infrastructure such as expressways and urban arterial roads lengthens, the performance of asphalt pavement gradually deteriorates under the combined action of traffic loading, the natural environment and material aging, and typical distresses such as cracks, potholes, ruts and raveling readily occur. Pavement damage not only directly reduces ride comfort and vehicle stability and increases fuel consumption and vehicle wear, but can also reduce the bearing capacity of the pavement structure and cause traffic accidents such as vehicles bouncing and losing control or tire blowouts, posing a serious threat to road traffic safety.
Current pavement maintenance decision systems have long relied on quantitative analysis of numerical monitoring data. Although some frontier studies have begun to consider unstructured information, systematic methods for constructing a specialized corpus for the pavement maintenance field are generally lacking. In fact, the unstructured information scattered across maintenance specifications, expert experience, policy documents and inspection reports is rich in qualitative descriptions, engineering practice logic and distress causation mechanisms. However, limited by data engineering bottlenecks, existing processing approaches often treat these text resources as simple labels or notes and fail to build a domain-specific semantic feature space.
This fragmented processing does not form a structured knowledge system, which makes it difficult to map textual information precisely onto quantitative monitoring data in the space-time dimension. The absence of a dedicated corpus and a deep alignment mechanism directly leaves maintenance decision models insensitive to sudden events and policy factors, and leaves distress assessment without multidimensional evidence.
In addition, existing pavement performance prediction methods focus on capturing the decay law of a single index through a statistical model or a machine learning algorithm. When fusing heterogeneous data, they lack strict space-time alignment and dynamic interaction mechanisms. General-purpose text data from the internet often contains a large amount of noise; if it is concatenated directly with numerical data without screening through a dedicated corpus and without a strict space-time window constraint, high-frequency noise is introduced that degrades prediction accuracy. The prior art lacks a systematic framework that simultaneously addresses dedicated corpus construction, space-time alignment of heterogeneous data and collaborative multi-index prediction, and can hardly meet the urgent demand of modern intelligent traffic operation and maintenance for accurate, scientific decisions.
A pavement maintenance decision technology is therefore needed that effectively fuses textual and numerical multi-source data, improves the comprehensiveness and accuracy of pavement performance assessment through deep correlation and information complementation across modalities, and provides scientific, reliable data support for maintenance decisions.
Existing pavement performance prediction and maintenance decision methods are mostly driven by numerical data and focus on capturing pavement performance decay laws through statistical analysis, machine learning or traditional deep learning. The basic approach is first to collect pavement-related numerical data (performance indexes such as PCI, RDI and RQI) and preprocess them through missing-value filling, categorical data encoding and dimension reduction, and finally, based on the prediction results and in combination with maintenance standards, to output simple maintenance decision suggestions. Zhu et al. developed a Grey-Markov fusion model in the literature "Research on performance prediction of highway asphalt pavement based on Grey–Markov model. Transportation Research Record, 2022, 2676 (4): 194-209."; by combining grey prediction with Markov chain correction, they addressed the strong dependence of a single grey model on data homogeneity and predicted the PCI of asphalt pavement with an error reduced to 0.67%, which is better than the single grey model. Tang et al., in the literature "Monthly attenuation prediction for asphalt pavement performance by using GM (1, 1)