CN-121999456-A - Long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning
Abstract
The invention provides a long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning. A Transformer network recognition model based on wavelet spatial decoupling (WSDT) is constructed, trained in two stages with a correlation-aware progressive learning strategy, and deployed on a vehicle to recognize long-tailed pavement. The WSDT architecture follows a hierarchical design from texture to semantics; together with the efficiency of the convolutional spatial attention and wavelet spatial decoupling modules, this meets the strict real-time perception requirements of automated driving systems. By decomposing the image into high-frequency and low-frequency components in the frequency domain, the model can explicitly capture fine high-frequency texture features such as edges, cracks, and water-film reflections; combined with spatially decoupled convolutions in the horizontal and vertical directions, the receptive field is enlarged without significantly increasing the amount of computation.
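As an illustration of the frequency decomposition described in the abstract, the following minimal Python sketch splits a single-channel image into one low-frequency and three high-frequency sub-bands with a single-level Haar wavelet transform and then reconstructs it. The choice of the Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the sub-band naming are assumptions made for illustration; the source does not state which wavelet basis the WSDB module uses.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2-D Haar DWT of an (H, W) array with even H and W.

    Returns (ll, lh, hl, hh): one low-frequency sub-band and three
    high-frequency sub-bands, each of shape (H/2, W/2).
    Sub-band naming conventions vary between libraries (assumption here).
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (low frequency)
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the (H, W) array exactly."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```

On a constant image all three high-frequency sub-bands are zero, whereas an intensity step that falls inside a 2x2 block (for example a thin crack edge) produces non-zero high-frequency coefficients.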
Inventors
- ZHANG RONGHUI
- WU YUNFAN
Assignees
- Sun Yat-sen University (中山大学)
Dates
- Publication Date: 2026-05-08
- Application Date: 2026-01-29
Claims (10)
- 1. A long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning, characterized by comprising the following steps: S1) acquiring pavement image data with an on-board camera and preprocessing the pavement image data; S2) constructing a Transformer network recognition model WSDT based on wavelet spatial decoupling, wherein the model WSDT comprises an input layer, a local texture encoding module, a local-global context encoding module, and an output head; the local texture encoding module comprises a first stage and a second stage, each formed by stacking a convolutional spatial attention module CSAB and a wavelet spatial decoupling module WSDB; the local-global context encoding module comprises a third stage and a fourth stage, each formed by stacking a convolutional spatial attention module CSAB and a global semantic Transformer module GSTB; S3) training the Transformer network recognition model WSDT in two stages using a correlation-aware progressive learning strategy CaPL; S4) deploying the trained Transformer network recognition model WSDT on a vehicle to realize long-tailed pavement recognition.
- 2. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 1, characterized in that: in the first stage, the initial feature embedding is processed by stacked convolutional spatial attention modules CSAB, which extract local spatial features using the efficiency of convolution; the wavelet spatial decoupling module WSDB then explicitly decomposes the features into high-frequency and low-frequency components using the discrete wavelet transform DWT, capturing the high-frequency texture details of micro-cracks and crushed-stone edges of the pavement to obtain local spatial texture features; in the second stage, the local spatial texture features are downsampled by a depthwise convolution downsampling module Conv-DS, and the downsampled features pass through stacked convolutional spatial attention modules CSAB and a terminal wavelet spatial decoupling module WSDB, which continue to capture frequency-domain features of the pavement texture at different scales to obtain enhanced local spatial texture features.
- 3. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 2, characterized in that: in the third stage, the enhanced local spatial texture features are downsampled by a depthwise convolution downsampling module Conv-DS, local features are then extracted by a convolutional spatial attention module CSAB, and long-range dependencies in the image are captured by the self-attention mechanism of the global semantic Transformer module GSTB, realizing the transition from texture details to global semantics and yielding transitional semantic features; in the fourth stage, the transitional semantic features are downsampled by a dual-path attention downsampling module DP-Att-DS, which uses a convolution path and an attention path in parallel, and the high-level semantic category information of the pavement image is then mined in depth by stacked convolutional spatial attention modules CSAB and a global semantic Transformer module GSTB to obtain the global semantic features.
- 4. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 3, characterized in that, for a given one of the plurality of convolutional spatial attention modules CSAB, the output local spatial texture features are expressed as: [formula]; [formula]; where [symbol] denotes the intermediate features after convolutional spatial attention CSA processing; [symbol] denotes the local spatial texture features output by that convolutional spatial attention module CSAB; [symbol] denotes a lightweight multi-layer perceptron for nonlinear feature transformation; and [symbol] denotes the convolutional spatial attention.
- 5. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 4, characterized in that the convolutional spatial attention generates a spatial attention map by depthwise convolution and applies it to the value vector to finally obtain the spatial attention features.
- 6. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 3, characterized in that the wavelet spatial decoupling module WSDB is located in the shallow layers of the network recognition model and enhances the local spatial texture features through the discrete wavelet transform DWT and spatially decoupled convolution; for a given wavelet spatial decoupling module WSDB, the output enhanced local spatial texture features are expressed as: [formula]; [formula]; where [symbol] denotes the enhanced local spatial texture features output by that wavelet spatial decoupling module WSDB; [symbol] denotes the intermediate features after [symbol] processing; and [symbol] denotes the wavelet spatial decoupling attention, which decouples and enhances the texture features in the frequency and spatial domains.
- 7. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 6, characterized in that the wavelet spatial decoupling attention decouples the enhanced texture features in the frequency domain and the spatial domain to capture high-frequency texture information, specifically comprising the following steps: S21) the input enhanced texture features pass through a [symbol] convolution, are decomposed into high-frequency and low-frequency sub-bands by the discrete wavelet transform DWT, and are then processed by a [symbol] convolution to obtain: [formula]; where [symbol] denotes the frequency-domain features; [symbol] denotes a convolution with kernel size k followed by batch normalization BN; and [symbol] denotes a convolution with BN and an activation function; S22) the frequency-domain features are spatially decoupled using the vertical convolution V-Conv and the horizontal convolution H-Conv, upsampled to restore the resolution, and passed through a Sigmoid function to generate the attention map: [formula]; where [symbol] denotes the upsampling operation; S23) the attention map is multiplied element-wise with the value vector, concatenated with the features restored by the inverse discrete wavelet transform IDWT, and fused by a linear layer to obtain the linear fusion features: [formula]; [formula]; where [symbol] denotes the linear fusion features and [symbol] denotes the concatenation operation.
- 8. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 3, characterized in that the global semantic Transformer module GSTB is located in the deep layers of the recognition network model and captures global semantic information; for a given global semantic Transformer module GSTB, the output global semantic features are expressed as: [formula]; [formula]; where [symbol] denotes the global semantic features output by that global semantic Transformer module GSTB; [symbol] denotes the intermediate features after global semantic self-attention processing; and [symbol] denotes the global semantic self-attention.
- 9. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 8, characterized in that the global semantic self-attention maps the global semantic features into query, key, and value vectors; the query and the key are then processed based on the multi-head attention and Talking-Head mechanisms, the Talking-Head mechanism promoting information interaction between heads, to obtain the global semantic attention map; the attention map weights the value vector, and a [symbol] depthwise convolution is introduced to enhance locality, yielding the global semantic self-attention features.
- 10. The long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning according to claim 1, characterized in that the correlation-aware progressive learning strategy CaPL transitions gradually from general feature learning to long-tailed adaptation learning and optimizes the model with a designed balanced similarity-aware binary cross-entropy loss function, the construction of which specifically comprises the following steps: S31) constructing a class similarity sparse matrix, specifically: S311) letting [symbol] be the feature vector of a given training sample, computing for each class its class feature center as: [formula]; where [symbol] is the median function and [symbol] denotes the true class label of that training sample; S312) computing the cosine similarity between the feature centers of any two classes to obtain the initial similarity matrix: [formula]; where [symbol] denotes the entry of the initial similarity matrix for the two classes, and [symbol], [symbol] are the feature centers of the two classes respectively; S313) performing Top-k sparsification on each column of the initial similarity matrix to obtain the similarity sparse matrix: [formula]; where [symbol] denotes the index set of the [symbol] classes with the highest similarity in a given column, and [symbol] is the corresponding column of the sparse matrix; S32) computing, for each class, its association exposure to the other classes, and then generating the similarity-aware bias through a negative suppression mechanism: [formula]; [formula]; [formula]; where [symbol] is a hyperparameter controlling the bias strength; [symbol] is the association exposure of the class; and [symbol] is a mean normalization term; S33) computing the class frequency prior bias, introducing a frequency bias based on Bayesian inference, namely: [formula]; [formula]; where [formula]; [symbol] is the number of samples of the class and [symbol] is the total number of samples; S34) combining the original output of the recognition network model, the class frequency prior bias, and the similarity-aware bias to obtain the final corrected logit value, and computing the final balanced similarity-aware binary cross-entropy loss function: [formula]; where [symbol] is the true label; [symbol] is the activation function; and [symbol] is the total number of classes.
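Step S31 of claim 10 can be sketched in Python as follows: per-class feature centers computed with an element-wise median, pairwise cosine similarity between the centers, and Top-k sparsification of each column. The function names, the `exclude_self` option (whether a class may count among its own Top-k neighbours), and the convention of zeroing all non-selected entries are assumptions made for illustration. The bias and loss formulas of steps S32 to S34 are not reproduced in the source text, so they are not sketched here.

```python
import numpy as np


def class_centers(features, labels, num_classes):
    """S311: element-wise median of the feature vectors of each class.

    features: (N, D) array, labels: (N,) integer array.
    Assumes every class has at least one sample.
    """
    return np.stack(
        [np.median(features[labels == c], axis=0) for c in range(num_classes)]
    )


def cosine_similarity_matrix(centers, eps=1e-12):
    """S312: cosine similarity between every pair of class centers."""
    norms = np.linalg.norm(centers, axis=1, keepdims=True)
    unit = centers / np.maximum(norms, eps)  # guard against zero vectors
    return unit @ unit.T


def topk_sparsify_columns(sim, k, exclude_self=True):
    """S313: keep the k largest entries of each column, zero the rest.

    exclude_self is an assumption: the source does not say whether the
    diagonal (self-similarity) takes part in the Top-k selection.
    """
    scores = sim.astype(np.float64)  # astype returns a copy
    if exclude_self:
        np.fill_diagonal(scores, -np.inf)
    top_rows = np.argsort(-scores, axis=0)[:k, :]  # (k, C) row indices
    cols = np.arange(sim.shape[1])                 # broadcast over k rows
    sparse = np.zeros_like(sim, dtype=np.float64)
    sparse[top_rows, cols] = sim[top_rows, cols]
    return sparse
```

With `exclude_self=True`, `k` must not exceed the number of classes minus one; otherwise a diagonal entry would be selected.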
Description
Long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning
Technical Field
The invention relates to the technical field of automated driving environment perception and artificial intelligence, and in particular to a long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning.
Background
With the rapid development of automated driving and intelligent transportation systems, the core functional modules of a vehicle now cover high-precision positioning and navigation, dynamic environment perception, intelligent control, and more. Environment perception is a foundational component that supports the vehicle's active safety systems and trajectory planning, and it plays a vital role in ensuring safe and efficient operation. In particular, real-time perception of road surface conditions (such as the surface material type and the dry or wet state) provides key information for acceleration, steering, and braking decisions. Different road surface types exhibit distinct roughness characteristics and friction coefficients, which directly affect vehicle dynamics and stability. Research shows that if the road surface type ahead can be predicted in advance, the vehicle gains additional decision time, which significantly improves the performance of active safety control and trajectory planning algorithms as well as ride comfort and safety. In addition, changing weather conditions alter how wet the road surface is and hence the adhesion coefficient between tire and road, which affects braking and handling performance.
Therefore, accurate recognition of the road surface type and its dry, wet, or slippery state is essential for vehicle active safety and reliable trajectory planning.
Existing road surface recognition technology follows three main technical routes: 1) tire-road friction coefficient estimation based on a vehicle dynamics model, which infers the tire-road adhesion condition backward from the model; 2) road surface roughness recognition based on vehicle vibration response, which infers the road surface condition by analyzing vibration signals while the vehicle is travelling; and 3) image recognition based on computer vision, which acquires images with an on-board camera and classifies them with machine learning or deep learning algorithms. The first two approaches mainly observe the road surface at the vehicle's current position, cannot predict the road conditions ahead, and generally require complex modeling and calibration, which limits real-time application. In contrast, vision-based methods have gradually become the mainstream solution for perceiving the road ahead because they can predict upcoming road conditions, measure without contact, and run in real time. Current mainstream vision methods include schemes based on convolutional neural networks (CNN) and schemes based on the Vision Transformer (ViT).
Although deep-learning-based visual pavement recognition has made some progress, it still faces the following significant technical bottlenecks in practical application: existing models struggle to combine fine-grained texture extraction with global semantic modeling, so the recognition rate on complex pavements is low; real-world data follows a severe long-tailed distribution (class imbalance), which causes recognition of safety-critical classes to fail; and high semantic similarity between classes makes existing methods prone to confusion.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning.
The technical solution of the invention is a long-tailed pavement recognition method based on a wavelet spatial decoupling Transformer and progressive association learning, comprising the following steps: S1) acquiring pavement image data with an on-board camera and preprocessing the pavement image data; S2) constructing a Transformer network recognition model WSDT based on wavelet spatial decoupling; S3) training the Transformer network recognition model WSDT in two stages using a correlation-aware progressive learning strategy CaPL; S4) deploying the trained Transformer network recognition model WSDT on a vehicle to realize long-tailed pavement recognition. Preferably, the wavelet s