CN-121982435-A - Pavement category identification model construction method based on multichannel depth convolution and application


Abstract

The application provides a pavement category identification model construction method based on multichannel depth convolution, and an application thereof. The method comprises the following steps: obtaining a plurality of pavement images marked with pavement types as a training sample set; constructing a pavement type recognition architecture comprising an initial feature extraction network, a depth feature extraction network and a classifier, and training the architecture with the training sample set; and finishing training when the iteration stop conditions are reached and storing the optimal parameters of the pavement type recognition architecture to obtain a pavement type recognition model. In this scheme, the multichannel depth convolution divides the input features by channel and processes the parts differently, which reduces invalid calculation; meanwhile, a lightweight gating network generates modulation coefficients based on global context, which achieves adaptive modulation of the feature channels, increases the contribution of effective features and makes the feature expression more targeted.
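The channel division and gated modulation summarized above (and detailed in claims 1 and 4) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names, the 3×3 kernel size, the 0.5 split ratio, the single linear layer plus sigmoid used as the gating network, the use of one scalar coefficient per branch, and the final channel-wise concatenation are all hypothetical choices. A per-channel 3×3 step stands in for the depth separable convolution.

```python
import math

def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 convolution, stride 1, zero padding.
    x is [C][H][W] nested lists; kernels is [C][3][3]."""
    out = []
    for ch, k in zip(x, kernels):
        h, w = len(ch), len(ch[0])
        res = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii, jj = i + di, j + dj
                        if 0 <= ii < h and 0 <= jj < w:
                            res[i][j] += ch[ii][jj] * k[di + 1][dj + 1]
        out.append(res)
    return out

def global_avg_pool(x):
    """Mean of every channel: [C][H][W] -> [C]."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]

def gate(context, weights, biases):
    """Assumed gating form: one linear layer followed by a sigmoid."""
    return [1.0 / (1.0 + math.exp(-(sum(w * c for w, c in zip(row, context)) + b)))
            for row, b in zip(weights, biases)]

def scale(feats, g):
    """Multiply every value of a [C][H][W] feature by the scalar g."""
    return [[[v * g for v in row] for row in ch] for ch in feats]

def split_channel_gated_conv(x, kernels, gate_w, gate_b, split=0.5):
    """Split channels, convolve only the first part, gate both parts."""
    c1 = int(len(x) * split)                 # assumed meaning of the division factor
    first, second = x[:c1], x[c1:]           # first / second sub-features
    first_depth = depthwise_conv3x3(first, kernels)  # first depth sub-features
    # Joint context vector: pooled first, second and first-depth sub-features.
    context = (global_avg_pool(first) + global_avg_pool(second)
               + global_avg_pool(first_depth))
    g1, g2 = gate(context, gate_w, gate_b)   # two modulation coefficients
    # Assumption: modulated branches are concatenated along the channel axis.
    return scale(first_depth, g1) + scale(second, g2), (g1, g2)

# Toy input: 4 channels of 3x3 ones, identity kernels, zero-initialized gate.
x = [[[1.0] * 3 for _ in range(3)] for _ in range(4)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
out, (g1, g2) = split_channel_gated_conv(
    x, [identity, identity], [[0.0] * 6 for _ in range(2)], [0.0, 0.0])
```

Only the first sub-features pass through the convolution, which is where the reduction in computation comes from; the second sub-features are carried through unchanged apart from their modulation coefficient.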

Inventors

WANG JUNCHENG
LI ZHI
ZHANG LENING
XU YIMAN

Assignees

Zhejiang Sci-Tech University (浙江理工大学)
Zhejiang Yazhixing Auto Parts Co., Ltd. (浙江亚之星汽车部件有限公司)

Dates

Publication Date: 20260505
Application Date: 20260409

Claims (10)

1. A pavement category identification model construction method based on multichannel depth convolution, characterized by comprising the following steps: acquiring a plurality of pavement images marked with pavement types as a training sample set; constructing a pavement type recognition architecture based on MobileNetV4, and training the pavement type recognition architecture by using the training sample set, wherein the pavement type recognition architecture comprises an initial feature extraction network, a depth feature extraction network and a classifier; in the training process of the pavement type recognition architecture, the initial feature extraction network adopts a convolution normalization layer to extract features of a pavement image to obtain initial pavement features; the depth feature extraction network adopts a plurality of serially connected general inversion bottleneck blocks to extract features of the initial pavement features to obtain depth pavement features; within each general inversion bottleneck block, a sub-channel depth convolution processes the input features to obtain an intermediate feature map, and spatial attention calculation is carried out on the intermediate feature map to obtain the output features of the general inversion bottleneck block, the output features of the last general inversion bottleneck block being the depth pavement features; the sub-channel depth convolution divides the corresponding input features into first sub-features and second sub-features in the channel dimension according to a preset division factor, the first sub-features are subjected to depth separable convolution to obtain first depth sub-features, and a gating network is used to obtain a first modulation coefficient corresponding to the first depth sub-features and a second modulation coefficient corresponding to the second sub-features, respectively; updating parameters of the pavement type recognition architecture based on the pavement type recognition result and the corresponding labels, ending training when the iteration stop conditions are reached, and storing the optimal parameters of the pavement type recognition architecture to obtain a pavement type recognition model.
2. The pavement category identification model construction method based on multichannel depth convolution as claimed in claim 1, wherein the general inversion bottleneck block comprises a spatial feature extraction unit, an intermediate depth convolution unit, a spatial attention calculation unit and a residual error output unit; the spatial feature extraction unit extracts spatial features of the input features and inputs the extracted spatial features into the intermediate depth convolution unit; the intermediate depth convolution unit carries out the multichannel depth convolution on its input features to obtain an intermediate feature map; the spatial attention calculation unit carries out spatial attention calculation on the intermediate feature map to obtain a spatial attention result; and the residual error output unit outputs the spatial attention result after residual connection with the input features of the general inversion bottleneck block.
3. The pavement category identification model construction method based on multichannel depth convolution as claimed in claim 2, characterized in that the step length (stride) of each intermediate depth convolution unit is preset; if the step length is equal to 1, the multichannel depth convolution is carried out on the input features to obtain the intermediate feature map; and if the step length is greater than 1, the depth separable convolution is carried out on the input features to obtain the intermediate feature map.
4. The pavement category identification model construction method based on multichannel depth convolution according to claim 1, characterized in that the first sub-features, the second sub-features and the first depth sub-features are each subjected to global average pooling and then concatenated to obtain a combined context feature vector, and the gating network generates the first modulation coefficient and the second modulation coefficient based on the combined context feature vector.
5. The pavement category identification model construction method based on multichannel depth convolution according to claim 1, wherein in the spatial attention calculation, the height attention weight of the intermediate feature map in the height dimension and the width attention weight of the intermediate feature map in the width dimension are acquired, and the intermediate feature map, the height attention weight and the width attention weight are multiplied element by element and then output.
6. The method for constructing the pavement category recognition model based on the multichannel depth convolution according to claim 5 is characterized in that average pooling, maximum pooling and pixel variance calculation are respectively carried out on the height dimension of the intermediate feature map, the average pooling result, the maximum pooling result and the pixel variance calculation result on the height dimension are added to obtain the height attention weight, average pooling, maximum pooling and pixel variance calculation are respectively carried out on the width dimension of the intermediate feature map, and the average pooling result, the maximum pooling result and the pixel variance calculation result on the width dimension are added to obtain the width attention weight.
7. The pavement category identification model construction method based on multichannel depth convolution according to claim 1, characterized in that a spatial attention output feature is input into a residual error output unit in the general inversion bottleneck block; the residual error output unit first carries out a 1×1 convolution on the spatial attention output feature by means of a projection layer to obtain a projection result, and then carries out residual connection between the projection result and the input features of the corresponding general inversion bottleneck block to obtain the output features of the general inversion bottleneck block, wherein the spatial attention output feature is the corresponding spatial attention calculation result.
8. The application method of the pavement category recognition model based on the multichannel depth convolution is characterized by comprising the following steps of: obtaining a pavement image to be identified, and inputting the pavement image to be identified into a pavement category identification model constructed by the method of any one of claims 1-7 to obtain a pavement category identification result.
9. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and the processor is arranged to run the computer program to perform the pavement category identification model construction method based on multichannel depth convolution as claimed in any one of claims 1 to 7.
10. A readable storage medium, wherein a computer program is stored in the readable storage medium, and when executed by a processor, the computer program implements the pavement category identification model construction method based on multichannel depth convolution as claimed in any one of claims 1 to 7.
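The spatial attention of claims 5 and 6 can be illustrated with a short sketch using only the Python standard library. It rests on assumptions that the claims leave open: the height attention weight is taken to hold one value per row of each channel (statistics computed across the width), the width attention weight one value per column, the "pixel variance" is taken to be the population variance, and the three statistics are summed without any further normalization, as the claim wording suggests. The function names are hypothetical.

```python
from statistics import pvariance

def axis_weight(lines):
    """For each 1-D line of pixels, add its mean, maximum and population variance."""
    return [sum(v) / len(v) + max(v) + pvariance(v) for v in lines]

def spatial_attention(x):
    """x is a [C][H][W] nested list; returns x scaled by height and width weights."""
    out = []
    for ch in x:
        height_w = axis_weight(ch)                              # one weight per row
        width_w = axis_weight([list(col) for col in zip(*ch)])  # one weight per column
        out.append([[v * height_w[i] * width_w[j] for j, v in enumerate(row)]
                    for i, row in enumerate(ch)])
    return out

# Toy single-channel 2x2 feature map.
attended = spatial_attention([[[1.0, 3.0], [1.0, 3.0]]])
```

In this toy input each row has mean 2, maximum 3 and variance 1, so both height weights are 6; the columns give width weights of 2 and 6, and each pixel is multiplied by the product of its row and column weights.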

Description

Pavement category identification model construction method based on multichannel depth convolution and application

Technical Field

The application relates to the field of pavement identification, in particular to a pavement type identification model construction method based on multichannel depth convolution and an application thereof.

Background

With the rapid development of smart cities and intelligent traffic systems, accurate perception of the vehicle environment has become a core link in deploying intelligent driving and improving traffic efficiency. Road surface type recognition is a key component of the vehicle perception system: its recognition accuracy and real-time inference performance directly determine the driving safety of the vehicle, the rationality of path planning and the effectiveness of intelligent control strategies, so it has become a research focus in the intelligent traffic field.

Conventional road surface type recognition technology falls into three categories: contact measurement, measurement based on system response, and non-contact measurement. Each has clear technical shortcomings. Contact measurement relies on physical sensors, such as accelerometers, in direct contact with the road surface; its measurement precision is relatively high at low speed, but the equipment cost is high and the real-time performance is poor, so it is difficult to adapt to the dynamic running requirements of a vehicle. Measurement based on system response recognizes the road surface from indirect data such as the vibration response of the vehicle suspension; the real-time performance is improved, but response lag on abruptly changing road sections easily causes recognition deviation. Non-contact measurement collects road surface images or point cloud data with visual sensors such as cameras and laser radar and classifies the road surface type with a deep learning model, and it is the mainstream direction of current development. However, the models used in related research, such as LSTM networks and SVMs combined with EfficientNet neural networks, generally have a large number of parameters and insufficient inference speed, and are difficult to deploy efficiently on vehicle-mounted embedded equipment.

The lightweight convolutional neural network MobileNet series greatly reduces the amount of calculation and the number of parameters by means of depth separable convolution, achieves efficient inference, and is widely applied in computer vision fields such as target detection; related research based on MobileNetV3 has effectively improved model generalization capability and prediction precision. However, conventional MobileNet series models have a simple network structure and limited feature expression capability. MobileNetV4 further optimizes the model structure and shows advantages in balancing complexity and performance, which suits road classification scenes sensitive to computing resources, but its amount of calculation and latency tend to rise as the model depth increases, which limits its application in road recognition scenes requiring high precision and high real-time performance.
At present, little research applies MobileNetV4 to road surface type classification. Without additional optimization, the original structure, while extremely lightweight, can hardly meet the high requirements of complex road surface scenes for feature robustness and fine-grained perception, and cannot cope with practical problems such as variable road surface textures and complex illumination and weather conditions. Exploring a road surface type identification method based on a lightweight MobileNetV4 has therefore become a key requirement for bringing road surface identification technology into practical use.

Disclosure of Invention

The embodiments of the application provide a pavement category identification model construction method based on multichannel depth convolution, and an application thereof. The multichannel depth convolution divides the input features by channel and processes the parts differently, which reduces invalid calculation; a lightweight gating network generates modulation coefficients based on global context, which achieves adaptive modulation of the feature channels, increases the contribution of effective features and makes the feature expression more targeted.

In a first aspect, an embodiment of the present application provides a method for constructing a pavement category identification model based on multichannel depth convolution, where the method includes: acquiring a plurality of pavement images marked with pavement types as a training sample set; constructing a pavement type recognition architecture based on Mob