CN-122023422-A - Bridge crack detection method and system based on improved YOLOv11
Abstract
The invention discloses a bridge crack detection method and system based on an improved YOLOv11, and belongs to the technical field of bridge safety. The invention takes YOLOv11 as the basic framework and improves performance by introducing an ADown module that optimizes feature downsampling efficiency, a CGLU module that enhances deep feature expression, an EMA module that improves feature stability, and an EfficientHead module that balances detection precision and speed, thereby achieving efficient recognition of thin, small-scale cracks against complex backgrounds in bridge crack detection. The bridge crack detection method can be integrated into a visual real-time monitoring platform; together they cover the full flow from bridge surface image acquisition, through real-time crack identification and localization, to visual presentation of the results, providing engineers with an intuitive and efficient on-site monitoring tool and meeting the real-time inspection requirements of bridge structures.
Inventors
- YE LING
- SONG BING
- ZHOU WENYI
- LI CHAO
- TIAN TIAN
Assignees
- 华东交通大学
Dates
- Publication Date
- 20260512
- Application Date
- 20260414
Claims (7)
- 1. A bridge crack detection method based on an improved YOLOv11, characterized by comprising the following steps: S1, constructing a YOLOv11 basic framework, wherein the basic framework comprises a backbone network, a neck network and a detection head, and these three core parts are connected through an information flow to form a complete detection pipeline; S2, integrating an ADown downsampling module, a CGLU convolutional gated linear unit, an EMA attention mechanism module and an EfficientHead module into YOLOv11 to form an improved YOLOv11 model; S3, processing a bridge crack image with the improved YOLOv11 model to identify and locate the crack area, wherein the specific process is as follows: S31, the backbone network adopts a progressive downsampling strategy, gradually reducing the spatial resolution of the input image through convolution operations while expanding the feature channel dimension, so as to balance computational efficiency and feature expression capability and extract the semantic features of the image; S32, the neck network adopts a bidirectional fusion architecture combining a feature pyramid network (FPN) and a path aggregation network (PAN), and performs cross-scale integration of the high-level semantic features and shallow spatial features output by the backbone network to construct a multi-scale feature pyramid; S33, the detection head receives the multi-scale feature pyramid constructed by the neck network, completes bounding box regression, category judgment and confidence prediction for the target, and converts the high-dimensional feature tensor into a target detection result; S34, the ADown downsampling module, the CGLU convolutional gated linear unit and the EMA attention mechanism module respectively provide downsampling efficiency optimization, deep feature enhancement and feature stability improvement during feature extraction and fusion, and the EfficientHead module balances detection precision and speed.
- 2. The bridge crack detection method based on the improved YOLOv11 as set forth in claim 1, wherein the ADown downsampling module adopts a dual-branch structure comprising an average pooling branch and a convolution branch, and the specific processing procedure is as follows: S311, performing an initial average pooling operation with a kernel size of 2 and a stride of 1 on the input feature map $X$ to obtain a feature map $X'$ of size $(H-1) \times (W-1)$ with the number of channels unchanged, the average pooling operation being $X' = \mathrm{AvgPool2d}(X)$, where AvgPool2d denotes a two-dimensional average pooling function; S312, uniformly dividing the feature map $X'$ into two parts $X_1$ and $X_2$ along the channel dimension through a Chunk function, $(X_1, X_2) = \mathrm{Chunk}(X')$, where the Chunk function denotes uniform segmentation along a specified dimension, so that $X_1$ and $X_2$ each hold half of the channels of $X'$; S313, performing feature extraction on $X_1$ with a 3×3 convolution to output $Y_1$, and performing 3×3 max pooling followed by a 1×1 convolution on $X_2$ to output $Y_2$; S314, concatenating $Y_1$ and $Y_2$ along the channel dimension, $Y = \mathrm{Concat}(Y_1, Y_2)$, to obtain the final output feature map $Y$ with $C_2$ channels, where $C_2$ is the target output channel number.
- 3. The bridge crack detection method based on the improved YOLOv11 as set forth in claim 2, wherein the CGLU convolutional gated linear unit includes a channel expansion and segmentation unit, a gated convolution unit and a feature recovery unit, and the specific processing procedure is as follows: S315, performing channel expansion on the input feature map through a 1×1 convolution, expanding the number of channels to 4/3 times the original dimension, and uniformly dividing the expanded feature map into two parts along the channel dimension, which serve respectively as a basic feature stream and a gating signal stream, the process being formally described as $(F_{base}, F_{gate}) = \mathrm{Split}(\mathrm{Conv}_{1\times1}(X_{in}))$, where $\mathrm{Conv}_{1\times1}$ denotes a 1×1 convolution operation, Split denotes a channel segmentation function, $X_{in}$ is the input feature map, and $F_{base}$ and $F_{gate}$ respectively denote the segmented basic feature and gating feature; S316, the gated convolution unit applies a depthwise separable convolution and a nonlinear activation to the basic feature stream, and then multiplies the result element by element with the gating feature stream, the process being expressed as $F' = \mathrm{GELU}(\mathrm{DWConv}_{3\times3}(F_{base})) \odot F_{gate}$, where $\mathrm{DWConv}_{3\times3}$ denotes a 3×3 depthwise separable convolution, GELU is the Gaussian error linear unit activation function, and $\odot$ denotes element-wise multiplication; S317, the feature recovery unit restores the number of channels of the processed feature to the original dimension through a 1×1 convolution and forms a residual connection with the original input feature, so that the final output of the CGLU convolutional gated linear unit is $Y = X_{in} + \mathrm{Dropout}(\mathrm{Conv}_{1\times1}(F'))$, where the Dropout operation randomly discards a portion of neurons during the training phase to enhance the generalization ability of the model.
- 4. The bridge crack detection method based on the improved YOLOv11 as set forth in claim 3, wherein the EMA attention mechanism module realizes cross-scale adaptive feature enhancement through grouped feature reconstruction and dual-path interaction, and the specific process is as follows: S321, given an input feature map $X \in \mathbb{R}^{B \times C \times H \times W}$, where B, C, H and W respectively denote batch size, number of channels, height and width, the EMA attention mechanism module regroups the feature map X along the channel dimension to obtain the grouped feature $X_g$, specifically $X_g = \mathrm{Reshape}(X) \in \mathbb{R}^{(B \times G) \times (C/G) \times H \times W}$, where G is the number of groups, set to 8; S322, capturing spatial statistics through bidirectional spatial pooling and establishing associations among channels with a convolution operation, specifically $F = \mathrm{Conv}(\mathrm{Concat}(P_h(X_g), P_w(X_g)))$, where $P_h$ and $P_w$ denote adaptive average pooling in the height and width directions respectively, Conv denotes a convolution operation, and Concat denotes a feature concatenation operation; S323, the height-direction pooling output has shape (B×G, C/G, H, 1) and the width-direction pooling output has shape (B×G, C/G, 1, W); the width-direction pooling output is transposed into a tensor of shape (B×G, C/G, W, 1), the two tensors are concatenated along the third dimension to obtain a tensor of shape (B×G, C/G, H+W, 1), and this tensor is then processed by a 1×1 convolution; S324, realizing adaptive weighting among features through a dual-path interaction mechanism, wherein the EMA attention mechanism module divides the feature map into two parallel paths: the first path multiplies the spatial attention weight and the grouped feature element by element and applies group normalization for feature enhancement to obtain $X_1$, and the second path extracts local detail features through a 3×3 convolution to obtain $X_2$; S325, computing attention weights across the two paths: first, global average pooling is applied to $X_1$ to obtain shape (B×G, C/G, 1, 1), which is reshaped to (B×G, C/G, 1), transposed to (B×G, 1, C/G) and activated by Softmax to obtain $X_{11}$; $X_2$ is reshaped to obtain $X_{12}$, and the matrix product of $X_{11}$ and $X_{12}$ is computed; the same global average pooling, reshaping, transposition and Softmax activation are applied to $X_2$ to obtain $X_{21}$, $X_1$ is reshaped to obtain $X_{22}$, and the matrix product of $X_{21}$ and $X_{22}$ is computed; the two matrix products are added and reshaped to (B×G, 1, H, W), and Sigmoid activation yields the attention weight map W; S326, multiplying the attention weight map W with the grouped feature $X_g$ element by element and outputting the enhanced feature map $Y = W \odot X_g$, where $\odot$ denotes element-wise multiplication.
- 5. The bridge crack detection method based on the improved YOLOv11 as set forth in claim 4, wherein the EfficientHead module is composed of a feature refinement module, a bounding box regression branch and a crack classification branch, and the specific processing procedure is as follows: S331, the feature refinement module applies two consecutive 3×3 grouped convolutions to each of the feature maps at three different scales from the backbone network, namely P3, P4 and P5: $F_i' = \mathrm{GConv}(\mathrm{GConv}(F_i; W_{g1}); W_{g2})$, where $F_i$ denotes the input feature map of the i-th scale layer, GConv denotes a grouped convolution operation, $W_{g1}$ and $W_{g2}$ are the weight matrices of the two convolution layers respectively, and the number of groups is set to 1/16 of the number of channels; S332, the bounding box regression branch adopts a Distribution Focal Loss (DFL) mechanism to convert the traditional coordinate regression problem into the prediction of a discrete probability distribution; for each preset anchor box, the regression branch outputs 4 groups of discrete distribution vectors, and the final bounding box coordinate is obtained through an integral operation: $\hat{y} = \sum_{n} n \cdot P(n)$, where $P(n)$ denotes the predicted probability of the n-th discrete value, obtained through Softmax normalization, and $\hat{y}$ is the continuous coordinate value obtained by integration, which is then converted into actual image coordinates through a linear transformation; S333, the crack classification branch directly outputs, through a 1×1 convolution layer, the confidence score that each position belongs to the crack category; S334, concatenating the outputs of the bounding box regression branch and the crack classification branch along the channel dimension to form the complete detection result.
- 6. A bridge crack detection system based on an improved YOLOv11, characterized by comprising an image acquisition module, an image processing unit, a data storage module and a result display module; the image acquisition module is used for acquiring crack images of the bridge; the image processing unit is used for executing computer instructions stored in a memory, the instructions implementing the bridge crack detection method based on the improved YOLOv11 according to any one of claims 1-5; the data storage module is connected with the image processing unit and used for storing the input original image, intermediate data produced during algorithm processing, the final quantitative result and the visualized image; and the result display module is connected with the image processing unit and used for displaying the processing result to a user.
- 7. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-5.
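The ADown steps S311 to S314 of claim 2 can be followed as a shape trace. The sketch below is illustrative only: the claim does not state the stride or padding of the 3×3 convolution and the 3×3 max pooling, so a stride of 2 and padding of 1 are assumed here, and the names `window_out` and `adown_shapes` are hypothetical.

```python
def window_out(size, kernel, stride, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def adown_shapes(c_in, h, w, c_out):
    """Trace (channels, height, width) through steps S311 to S314."""
    # S311: average pooling, kernel 2, stride 1 -> (H-1) x (W-1), channels unchanged.
    h1, w1 = window_out(h, 2, 1), window_out(w, 2, 1)
    # S312: Chunk splits the channels into two equal halves X1 and X2.
    half = c_in // 2
    # S313, branch 1: 3x3 convolution on X1 (ASSUMED stride 2, padding 1).
    h2, w2 = window_out(h1, 3, 2, 1), window_out(w1, 3, 2, 1)
    y1 = (c_out // 2, h2, w2)
    # S313, branch 2: 3x3 max pooling on X2 (ASSUMED stride 2, padding 1),
    # followed by a 1x1 convolution that only changes the channel count.
    y2 = (c_out // 2, h2, w2)
    # S314: concatenate Y1 and Y2 along the channel dimension.
    out = (y1[0] + y2[0], h2, w2)
    return {
        "pooled": (c_in, h1, w1),
        "x1": (half, h1, w1),
        "x2": (half, h1, w1),
        "y1": y1,
        "y2": y2,
        "out": out,
    }


shapes = adown_shapes(c_in=64, h=80, w=80, c_out=128)
print(shapes["pooled"], shapes["out"])
```

Under these assumptions an 80×80 map is first reduced to 79×79 by the stride-1 average pooling and then halved to 40×40 by the two stride-2 branches.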
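The gating in steps S315 and S316 of claim 3 can be illustrated on plain lists. This is a minimal sketch, not the patented implementation: the 3×3 depthwise convolution is replaced by an identity so that only the GELU gate is shown, and the helper names `gelu`, `cglu_channels` and `gate` are hypothetical.

```python
import math


def gelu(x):
    """Gaussian error linear unit, exact erf form."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def cglu_channels(c):
    """Channel counts after the 4/3 expansion (S315) and the even split."""
    expanded = 4 * c // 3
    return expanded, expanded // 2


def gate(base, gate_signal):
    """S316 with the depthwise convolution ASSUMED to be an identity:
    GELU(base) multiplied element by element with the gating stream."""
    return [gelu(b) * g for b, g in zip(base, gate_signal)]


print(cglu_channels(96))            # expanded width and per-stream width
print(gate([0.0, 1.0], [5.0, 2.0]))  # a zero base value blocks the gate signal
```

With 96 input channels the expansion yields 128 channels, split into a 64-channel basic stream and a 64-channel gating stream; the final 1×1 convolution of S317 would map the gated result back to 96 channels.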
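The tensor shapes listed in steps S321 to S325 of claim 4 can be checked with simple shape arithmetic. The sketch below only tracks shapes, not values; the function names `matmul_shape` and `ema_shapes` are hypothetical, and the example sizes are chosen arbitrarily.

```python
def matmul_shape(a, b):
    """Shape of a batched matrix product (n, p, q) @ (n, q, r)."""
    assert a[0] == b[0] and a[2] == b[1], "incompatible shapes"
    return (a[0], a[1], b[2])


def ema_shapes(b, c, h, w, groups=8):
    """Shape trace of the EMA attention steps for a (B, C, H, W) input."""
    bg, cg = b * groups, c // groups
    grouped = (bg, cg, h, w)          # S321: regroup channels into G groups
    pooled_h = (bg, cg, h, 1)         # S323: pooling along the width axis
    pooled_w_t = (bg, cg, w, 1)       # S323: (.., 1, W) transposed to (.., W, 1)
    stacked = (bg, cg, h + w, 1)      # S323: concatenation on the third dimension
    x11 = (bg, 1, cg)                 # S325: pooled, reshaped, transposed X1
    x12 = (bg, cg, h * w)             # S325: X2 flattened over space
    product = matmul_shape(x11, x12)  # (B*G, 1, H*W)
    weights = (bg, 1, h, w)           # S325: sum of both products, reshaped
    return {
        "grouped": grouped,
        "pooled_h": pooled_h,
        "pooled_w_t": pooled_w_t,
        "stacked": stacked,
        "product": product,
        "weights": weights,
    }


trace = ema_shapes(b=2, c=64, h=20, w=20)
print(trace["stacked"], trace["weights"])
```

The weight map has a single channel per group, so the element-wise multiplication in S326 broadcasts it over the C/G channels of each group.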
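The integral operation of step S332 in claim 5 amounts to taking the expectation of a Softmax-normalized discrete distribution. The sketch below shows that calculation under stated assumptions: the claim does not specify the linear transformation to image coordinates, so the common convention of four side distances (left, top, right, bottom) measured in grid units from an anchor point and scaled by the feature-map stride is assumed, and `dfl_expectation` and `decode_box` are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable Softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dfl_expectation(logits):
    """Expected bin index: sum over n of n * P(n), with P = Softmax(logits)."""
    return sum(n * p for n, p in enumerate(softmax(logits)))


def decode_box(anchor_xy, side_logits, stride):
    """ASSUMED linear transform: the four expectations are left, top, right
    and bottom distances in grid units, scaled by the stride to pixels."""
    left, top, right, bottom = (dfl_expectation(s) for s in side_logits)
    ax, ay = anchor_xy
    return (
        (ax - left) * stride,
        (ay - top) * stride,
        (ax + right) * stride,
        (ay + bottom) * stride,
    )


uniform = [0.0, 0.0, 0.0, 0.0]  # equal probability on bins 0..3
print(dfl_expectation(uniform))
print(decode_box((10.5, 10.5), [uniform] * 4, stride=8))
```

Predicting a distribution rather than a single value lets the regression branch express uncertainty about blurred crack boundaries while still producing one continuous coordinate per side.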
Description
Bridge crack detection method and system based on improved YOLOv11
Technical Field
The invention relates to the technical field of bridge safety, in particular to a bridge crack detection method and system based on an improved YOLOv11.
Background
Cracks are the most common defect type in bridge structures, and early, accurate detection and identification are key to evaluating the health of a bridge structure and preventing potential safety hazards. Traditional bridge crack detection relies mainly on manual inspection, which is inefficient and labor-intensive, and its results are prone to deviation because they depend on the experience of the inspector. In addition, when working in dangerous areas such as at height or underwater, the safety of workers is difficult to ensure. Existing crack detection technology includes edge detection methods based on traditional machine vision and target detection and segmentation methods based on deep learning, but obvious shortcomings remain in detection precision, operating efficiency and adaptability to complex bridge scenes, which limits large-scale application in engineering practice.
The current main technical routes for automatic bridge crack detection can be divided into traditional image processing methods and deep learning methods. Traditional methods are built on hand-designed features: they capture crack contours with edge detection operators such as Sobel and Canny and combine threshold segmentation with morphological operations to extract regions, and they are widely applied in scenes with simple backgrounds and clear crack features.
The deep learning methods fall mainly into two types: methods based on target detection frameworks, such as Fast R-CNN and the YOLO series, which are widely applied in real-time detection scenes because they balance speed and precision; and methods based on semantic segmentation, such as the U-Net and DeepLab series, which are suited to cracks of complex shape and achieve pixel-level identification through an encoder-decoder structure. These methods learn deep features automatically and reduce manual dependence, but they still have drawbacks: general-purpose models have a high miss rate on cracks of varied shapes, large parameter counts make deployment on mobile devices difficult, and robustness in complex scenes needs improvement.
Although bridge crack detection technology has advanced, the prior art still faces many challenges in becoming a practical, highly reliable engineering detection scheme, particularly in the following two respects.
First, the models are not sufficiently targeted at bridge crack characteristics. The network structure and parameter settings of conventional general-purpose target detection models are designed for generic targets and do not fully consider the distinctive characteristics of bridge cracks, which are thin and slender, irregular in shape and low in contrast with the background. As a result, the feature extraction module has limited ability to capture key crack information, non-crack areas (such as surface scratches and stains) are easily misjudged as cracks, fine cracks are missed, and the detection precision struggles to meet engineering requirements.
Second, the computational complexity of the models does not match real-time deployment requirements.
Existing mainstream deep learning models generally adopt deeper network structures and larger feature map sizes in pursuit of high precision, so their parameter counts are huge, their computational cost is high and their inference is slow. Real-time inference is therefore difficult to achieve on the portable devices or embedded platforms used at bridge inspection sites, which restricts the application of these models in actual engineering.
Disclosure of Invention
The invention aims to provide a bridge crack detection method and system based on an improved YOLOv11 that address problems in bridge crack detection such as insufficient capture of thin cracks and the imbalance between real-time performance and precision. Based on YOLOv11, the invention integrates four improved modules, ADown, CGLU, EMA and EfficientHead, to optimize feature processing and detection performance, achieves efficient and accurate recognition and localization of thin, small-scale cracks against complex backgrounds, and integrates with a visual real-time monitoring platform to meet the engineering requirements of real-time bridge inspection.
In order to achieve the above purpose, the invention provides a bridge crack detection method based on an improved YOLOv11, comprising the following steps: S1, constructing a YOLOv11 basic framework, wherein the basic framework comprises a backbone network, a neck network and a detect