CN-122020351-A - Road network disease image recognition method and system based on deep learning
Abstract
The invention provides a road network disease image recognition method and system based on deep learning. A continuous image sequence of the road surface and vehicle-mounted inertial navigation data for the corresponding period are collected and integrated, through timestamp synchronization, into a joint acquisition data set. Spectral-geometric dual-domain decoupling is performed on the image frames in the joint acquisition data set to obtain a spectral feature component and a geometric structure component for each image frame, and association mapping is performed on the two components to generate a cross-domain feature matrix. The cross-domain feature matrix is input into a pre-trained topology-aware graph neural network, graph structure modeling is performed in combination with road network node connectivity information, and a topology embedding vector is generated by iteratively updating node association strengths. Vibration spectrum data are extracted from the navigation data records in the joint acquisition data set that correspond to the topology embedding vector, cross-modal alignment is performed on the topology embedding vector and the vibration spectrum data to generate an alignment-enhanced feature vector, and a road network disease recognition result is generated from that vector. The method can improve the accuracy of road network disease recognition.
Inventors
- FENG QUAN
- Leng Zhujun
- LI YIYING
- CHEN JUN
Assignees
- 贵州汇联通支付服务有限公司
- 贵州黔通智联科技股份有限公司
Dates
- Publication Date
- 20260512
- Application Date
- 20251208
Claims (10)
- 1. A road network disease image recognition method based on deep learning, characterized by comprising the following steps: collecting a continuous image sequence of the road surface and vehicle-mounted inertial navigation data for the corresponding period, and integrating the continuous image sequence and the vehicle-mounted inertial navigation data into a joint acquisition data set through timestamp synchronization, wherein each data unit in the joint acquisition data set comprises an image frame and the corresponding navigation data record from the same acquisition time; performing spectral-geometric dual-domain decoupling on the image frames in the joint acquisition data set to obtain a spectral feature component and a geometric structure component for each image frame, and performing association mapping on the spectral feature component and the geometric structure component to generate a cross-domain feature matrix; inputting the cross-domain feature matrix into a pre-trained topology-aware graph neural network, performing graph structure modeling on the cross-domain feature matrix in combination with road network node connectivity information, and generating a topology embedding vector by iteratively updating node association strengths; extracting vibration spectrum data from the navigation data records in the joint acquisition data set that correspond to the topology embedding vector, and performing cross-modal alignment on the topology embedding vector and the vibration spectrum data to generate an alignment-enhanced feature vector; and performing joint recognition of disease type and severity on the alignment-enhanced feature vector to generate a road network disease recognition result.
- 2. The method of claim 1, wherein performing spectral-geometric dual-domain decoupling on the image frames in the joint acquisition data set to obtain a spectral feature component and a geometric structure component for each image frame, and performing association mapping on the spectral feature component and the geometric structure component to generate a cross-domain feature matrix, comprises: performing color space conversion on the image frame, converting it from its original color space to a target color space, separating it into a plurality of spectral channel components, and performing spatial noise suppression on each spectral channel component to generate a denoised spectral channel set; performing edge contour extraction on the image frame, identifying the structural edge lines of the road surface in the image frame, and calculating the curvature change rate and directional gradient distribution of the structural edge lines to generate a geometric structure description vector; inputting the denoised spectral channel set into a spectral feature encoder, and applying a nonlinear transformation to each spectral channel component through a multi-layer perceptron to generate the spectral feature component, wherein the spectral feature component comprises the spectral response intensity distribution of each channel; performing structural feature encoding on the geometric structure description vector, and extracting spatial-dimension features from it with a convolutional neural network to generate the geometric structure component, wherein the geometric structure component comprises the spatial topological relations of the edge lines; and constructing a spectral-geometric association mapping matrix, matching and aligning the feature dimensions of the spectral feature component and the geometric structure component, and performing association mapping on the two components through a feature concatenation operation to generate the cross-domain feature matrix, wherein the row dimension of the cross-domain feature matrix corresponds to the feature channels of the spectral feature component and the column dimension corresponds to the spatial positions of the geometric structure component.
- 3. The method of claim 2, wherein performing spatial noise suppression on each spectral channel component to generate a denoised spectral channel set comprises: performing local variance calculation on each spectral channel component by traversing every pixel of the component with a preset sliding window and calculating the variance of the pixel values within the window, to generate a variance distribution map; determining a segmentation threshold between noise and non-noise regions from the variance distribution map, marking regions whose variance is greater than the segmentation threshold as noise candidate regions and regions whose variance is less than or equal to the segmentation threshold as non-noise regions; performing adaptive median filtering on the noise candidate regions, adjusting the filter window size according to the distribution of neighborhood pixel values around each pixel in the noise candidate regions, sorting the pixel values within the window, and replacing the original pixel value with the median; performing guided filtering on the non-noise regions, using the original spectral channel component as the guide image to apply edge-preserving smoothing to the pixel values of the non-noise regions; and merging the noise candidate regions after adaptive median filtering with the non-noise regions after guided filtering to generate a denoised spectral channel component, and combining all denoised spectral channel components into the denoised spectral channel set.
- 4. The method of claim 2, wherein calculating the curvature change rate and directional gradient distribution of the structural edge lines to generate a geometric structure description vector comprises: extracting sampling points from the identified structural edge lines of the road surface by acquiring sampling point coordinates at equal intervals along each edge line, to generate a sampling point sequence, wherein the sampling point sequence comprises a set of coordinate points distributed continuously along the edge line; calculating tangent direction vectors between adjacent sampling points in the sampling point sequence, and differentiating the tangent direction vector at each sampling point to obtain a curvature vector, wherein the magnitude of the curvature vector represents the curvature at the sampling point and its direction represents the direction of curvature change; calculating the rate of change of the curvature vector magnitude by differencing the curvature magnitudes of consecutive sampling points to obtain the curvature change rate, wherein the curvature change rate characterizes how sharply the bending of the edge line changes; performing gradient calculation for the structural edge lines in the horizontal and vertical directions on the grayscale image of the image frame to generate a directional gradient map, wherein the directional gradient map comprises the gradient magnitude and gradient direction of each pixel; and extracting the gradient magnitudes and gradient directions at the positions of the structural edge lines from the directional gradient map, constructing a directional gradient distribution matrix, and concatenating the curvature change rate and the directional gradient distribution matrix as row vectors to generate the geometric structure description vector, wherein each element of the geometric structure description vector corresponds to one geometric feature parameter of the edge line.
- 5. The method of claim 1, wherein inputting the cross-domain feature matrix into a pre-trained topology-aware graph neural network, performing graph structure modeling on the cross-domain feature matrix in combination with road network node connectivity information, and generating a topology embedding vector by iteratively updating node association strengths comprises: extracting a feature node set from the cross-domain feature matrix, taking each row feature vector of the cross-domain feature matrix as an initial node feature of the graph neural network, and generating a node feature matrix, wherein the row dimension of the node feature matrix corresponds to the number of nodes and the column dimension corresponds to the feature dimension; acquiring road network node connectivity information, wherein the road network node connectivity information comprises the connection relations between road sections in the road network and the spatial distances between road sections, and constructing an initial adjacency matrix based on the spatial distances, wherein each element of the initial adjacency matrix represents the initial connection strength between the corresponding nodes; inputting the node feature matrix and the initial adjacency matrix into a graph construction layer of the topology-aware graph neural network to construct a road network disease feature graph, wherein the nodes of the road network disease feature graph are the feature node set and the edges are the node connection relations determined from the initial adjacency matrix; invoking a graph convolution layer of the topology-aware graph neural network to perform graph feature aggregation on the road network disease feature graph, aggregating the features of each node's neighborhood nodes onto that node by multiplying the adjacency matrix by the node feature matrix, to generate an aggregated feature matrix; calculating node association strength update values based on the aggregated feature matrix, determining association strength adjustment coefficients by comparing the similarity of node features before and after aggregation, and iteratively updating the initial adjacency matrix based on the association strength adjustment coefficients to generate an updated adjacency matrix; and repeating the graph feature aggregation and the adjacency matrix update a preset number of times until the node association strengths converge, and taking the final node feature matrix as the topology embedding vector, wherein the topology embedding vector comprises dynamic association information among the nodes.
- 6. The method of claim 5, wherein calculating node association strength update values based on the aggregated feature matrix, determining association strength adjustment coefficients by comparing the similarity of node features before and after aggregation, and iteratively updating the initial adjacency matrix based on the association strength adjustment coefficients to generate an updated adjacency matrix comprises: performing similarity calculation between the aggregated feature matrix and the node feature matrix before the update, calculating for each node the similarity between its feature vector in the aggregated feature matrix and its feature vector in the node feature matrix before the update, to generate a similarity matrix; mapping each element of the similarity matrix to a preset interval and applying a nonlinear transformation to the similarity values to generate the association strength adjustment coefficients; multiplying the association strength adjustment coefficients by the corresponding elements of the initial adjacency matrix to obtain a preliminarily adjusted adjacency matrix, wherein each element of the preliminarily adjusted adjacency matrix is the product of the initial connection strength and the association strength adjustment coefficient; performing row normalization on the preliminarily adjusted adjacency matrix by calculating the sum of the elements in each row and dividing each element by its row sum so that every row sums to 1, to generate a probabilistic adjacency matrix; constructing an association strength regularization term, calculating a regularization loss value based on the Frobenius norm of the probabilistic adjacency matrix, and adding the regularization loss value to an association strength update objective function; and minimizing the association strength update objective function by a gradient descent algorithm, iteratively adjusting the elements of the probabilistic adjacency matrix until the regularization loss value is smaller than a preset threshold, and taking the probabilistic adjacency matrix at that point as the updated adjacency matrix.
- 7. The method of claim 5, wherein invoking the graph convolution layer of the topology-aware graph neural network to perform graph feature aggregation on the road network disease feature graph, aggregating the features of each node's neighborhood nodes onto that node by multiplying the adjacency matrix by the node feature matrix, to generate an aggregated feature matrix comprises: inputting the node feature matrix of the road network disease feature graph and the updated adjacency matrix into a feature transformation module of the graph convolution layer, applying a linear transformation to the node feature matrix, and mapping the node features into a feature space by multiplying a weight matrix by the node feature matrix, to generate a transformed node feature matrix; performing symmetric normalization on the updated adjacency matrix by calculating the sum of the updated adjacency matrix and the identity matrix to obtain an adjacency matrix with self-loops, and performing degree matrix normalization on the adjacency matrix with self-loops to generate a symmetrically normalized adjacency matrix; performing matrix multiplication of the symmetrically normalized adjacency matrix and the transformed node feature matrix to complete the aggregation of neighborhood node features and generate a preliminary aggregated feature matrix; performing nonlinear activation on the preliminary aggregated feature matrix by applying an activation function to each of its elements, enhancing the nonlinear expressive capacity of the features, to generate an activated aggregated feature matrix; constructing a feature aggregation residual connection by performing element-wise addition of the transformed node feature matrix and the activated aggregated feature matrix to generate a residual-enhanced aggregated feature matrix, wherein the residual-enhanced aggregated feature matrix is used to mitigate the vanishing gradient problem in deep networks; and performing batch normalization on the residual-enhanced aggregated feature matrix by calculating the mean and variance of each feature channel and standardizing the feature values, to generate the aggregated feature matrix, whereby the feature distribution of the aggregated feature matrix is more stable.
- 8. The method of claim 1, wherein extracting vibration spectrum data from the navigation data records in the joint acquisition data set that correspond to the topology embedding vector, and performing cross-modal alignment on the topology embedding vector and the vibration spectrum data to generate an alignment-enhanced feature vector, comprises: screening the joint acquisition data set for the navigation data records corresponding to the topology embedding vector, matching the vehicle-mounted inertial navigation data that has the same timestamp as the generation timestamp of the topology embedding vector, to obtain the corresponding navigation data records; performing spectral analysis on the vibration acceleration data in the navigation data records, converting the vibration acceleration data from a time-domain signal to a frequency-domain signal to generate vibration spectrum data, wherein the vibration spectrum data comprises the amplitudes of different frequency components; performing feature extraction on the vibration spectrum data, extracting its peak frequency, spectral centroid and bandwidth parameters to generate a vibration spectrum feature vector, wherein the dimension of the vibration spectrum feature vector is consistent with the feature dimension of the topology embedding vector; inputting the topology embedding vector and the vibration spectrum feature vector into a dynamic time warping module, calculating a feature distance matrix between the topology embedding vector and the vibration spectrum feature vector, and searching for the optimal alignment path by a dynamic programming algorithm to generate an alignment path matrix; performing time-dimension alignment on the topology embedding vector and the vibration spectrum feature vector according to the alignment path matrix, rearranging the feature sequence of the topology embedding vector and the feature sequence of the vibration spectrum feature vector according to the alignment path matrix, to generate an aligned bimodal feature matrix; and performing feature fusion on the aligned bimodal feature matrix, fusing the aligned topology embedding vector with the aligned vibration spectrum feature vector by a feature concatenation operation to generate the alignment-enhanced feature vector, wherein the alignment-enhanced feature vector contains cross-modal association information between the images and the vibration.
- 9. The method of claim 8, wherein performing spectral analysis on the vibration acceleration data in the navigation data records, converting the vibration acceleration data from a time-domain signal to a frequency-domain signal to generate vibration spectrum data, comprises: preprocessing the vibration acceleration data in the navigation data records by removing the direct-current component and filtering the data so that the alternating-current component is retained; segmenting the preprocessed vibration acceleration data according to a preset time window to generate a plurality of consecutive vibration data segments of equal length; windowing each vibration data segment by weighting it with a Hamming window function to reduce spectral leakage, to generate windowed data segments; performing a fast Fourier transform on each windowed data segment, converting the time-domain vibration acceleration signal into a frequency-domain complex spectrum, and calculating the modulus of the complex spectrum to generate an amplitude spectrum; extracting the frequency axis and amplitude values of the amplitude spectrum, dividing the frequency axis into a plurality of frequency intervals, and calculating the average amplitude within each frequency interval to generate vibration spectrum data, wherein the horizontal axis of the vibration spectrum data is the frequency interval and the vertical axis is the average amplitude; and smoothing the vibration spectrum data by applying sliding-window averaging to its amplitude values with a moving average filter, to generate smoothed vibration spectrum data.
- 10. A computer system comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 9 when the program is executed.
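The graph feature aggregation recited in claim 7 can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation: the function names (`matmul`, `symmetric_normalize`, `aggregate`) are hypothetical, ReLU is assumed as the activation function because the claim names none, the weight matrix is assumed square so that the residual addition has matching shapes, and the final batch normalization step is omitted for brevity.

```python
import math

def matmul(a, b):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def symmetric_normalize(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(adj)
    # Add the identity matrix to obtain an adjacency matrix with self-loops.
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    # Degrees are at least 1 for non-negative adjacency values.
    deg = [sum(row) for row in with_loops]
    return [[with_loops[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def aggregate(features, adj, weight):
    """One aggregation pass: transform, normalize, aggregate, activate, add residual."""
    transformed = matmul(features, weight)        # linear transformation X W
    norm_adj = symmetric_normalize(adj)           # symmetrically normalized adjacency
    prelim = matmul(norm_adj, transformed)        # neighborhood aggregation
    activated = [[max(0.0, v) for v in row] for row in prelim]  # ReLU (assumed)
    # Residual connection: element-wise sum of transformed and activated features.
    return [[t + a for t, a in zip(t_row, a_row)]
            for t_row, a_row in zip(transformed, activated)]

# Two connected nodes with one-hot features and an identity weight matrix.
features = [[1.0, 0.0], [0.0, 1.0]]
adjacency = [[0.0, 1.0], [1.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
print(aggregate(features, adjacency, weight))  # [[1.5, 0.5], [0.5, 1.5]]
```

In this toy example each node keeps its own feature through the residual path and receives half of its neighbor's feature through the normalized adjacency matrix, which is the neighborhood mixing the claim describes.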
Description
Road network disease image recognition method and system based on deep learning
Technical Field
The invention relates to the field of image processing, and in particular to a road network disease image recognition method and system based on deep learning.
Background
As the maintenance demands of road infrastructure grow, analyzing and processing road surface image data makes it possible to automatically identify the type and severity of diseases such as cracks, potholes and settlement, and computer vision and deep learning methods are now widely combined to improve recognition efficiency. In the prior art, the fine-grained splitting and structured association of image spectral and geometric features is insufficient at the feature processing stage, which limits how well the complementary information in multidimensional data is used in the feature representation. In addition, prior knowledge such as the road network topology is not sufficiently fused into the feature modeling process, so the association patterns of disease features across the road network space are difficult to capture accurately, and the accuracy and reliability of disease recognition leave room for improvement.
Disclosure of Invention
The invention aims to provide a road network disease image recognition method and system based on deep learning.
The technical solution of the embodiments of the invention is realized as follows: In a first aspect, the invention provides a road network disease image recognition method based on deep learning, comprising: collecting a continuous image sequence of the road surface and vehicle-mounted inertial navigation data for the corresponding period, and integrating the continuous image sequence and the vehicle-mounted inertial navigation data into a joint acquisition data set through timestamp synchronization, wherein each data unit in the joint acquisition data set comprises an image frame and the corresponding navigation data record from the same acquisition time; performing spectral-geometric dual-domain decoupling on the image frames in the joint acquisition data set to obtain a spectral feature component and a geometric structure component for each image frame, and performing association mapping on the spectral feature component and the geometric structure component to generate a cross-domain feature matrix; inputting the cross-domain feature matrix into a pre-trained topology-aware graph neural network, performing graph structure modeling on the cross-domain feature matrix in combination with road network node connectivity information, and generating a topology embedding vector by iteratively updating node association strengths; extracting vibration spectrum data from the navigation data records in the joint acquisition data set that correspond to the topology embedding vector, and performing cross-modal alignment on the topology embedding vector and the vibration spectrum data to generate an alignment-enhanced feature vector; and performing joint recognition of disease type and severity on the alignment-enhanced feature vector to generate a road network disease recognition result.
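The timestamp synchronization that builds the joint acquisition data set can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method as implemented by the applicant: the function name `synchronize`, the `tolerance` parameter and the dictionary layout of a data unit are hypothetical, and nearest-timestamp matching within a tolerance is assumed because the camera and the inertial navigation unit rarely sample at exactly the same instant.

```python
from bisect import bisect_left

def synchronize(frames, nav_records, tolerance=0.02):
    """Pair each image frame with the nearest navigation record in time.

    frames:      list of (timestamp_seconds, frame) tuples
    nav_records: list of (timestamp_seconds, record) tuples, sorted by timestamp
    tolerance:   maximum allowed time difference in seconds (assumed value)
    """
    nav_times = [t for t, _ in nav_records]
    dataset = []
    for t_frame, frame in frames:
        # Locate the insertion point, then compare the neighbors on both sides.
        i = bisect_left(nav_times, t_frame)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(nav_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(nav_times[j] - t_frame))
        # Frames without a sufficiently close navigation record are dropped.
        if abs(nav_times[best] - t_frame) <= tolerance:
            dataset.append({"timestamp": t_frame,
                            "frame": frame,
                            "nav": nav_records[best][1]})
    return dataset

frames = [(0.00, "f0"), (0.10, "f1"), (0.50, "f2")]
nav = [(0.01, "n0"), (0.09, "n1"), (0.20, "n2")]
for unit in synchronize(frames, nav):
    print(unit["frame"], unit["nav"])
```

Each returned dictionary plays the role of one data unit: an image frame together with the navigation data record from the same acquisition time. The frame at 0.50 s is discarded because no navigation record lies within the tolerance.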
In a second aspect, the invention provides a computer system comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor implements the steps of the method described above when executing the program.
According to the invention, a continuous image sequence of the road surface and vehicle-mounted inertial navigation data are collected and integrated, through timestamp synchronization, into a joint acquisition data set, so that the visual data and the physical sensing data form unified data units bound in space and time. Spectral-geometric dual-domain decoupling is performed on the image frames in the joint acquisition data set to obtain spectral feature components and geometric structure components, which are association-mapped to generate a cross-domain feature matrix; this achieves fine-grained splitting and structured association of spectral and geometric features and improves the richness and accuracy of the feature representation. The cross-domain feature matrix is input into the topology-aware graph neural network, graph structure modeling is performed in combination with road network node connectivity information, and a topology embedding vector is generated by iteratively updating node association strengths, so that the graph network fuses prior knowledge of the road network topology and adaptively learns how disease features propagate and associate across the road network, the shortcoming that traditional feature extraction ignores the physical connections of the road network is overcome, and dynamic associat