CN-121982045-A - Bridge crack image segmentation system and method thereof


Abstract

The invention discloses a bridge crack image segmentation system and method, belonging to the technical field of computer vision and image processing. The system comprises an image acquisition module, a data preprocessing module, a segmentation network model, a model training module and a segmentation execution module. The segmentation network model is an SA-UNet model whose innovations are that an SE channel attention module is embedded in each layer of the feature encoder to strengthen key channel features, and an IIA interactive spatial attention module is introduced into the skip connection paths between the encoder and the decoder to achieve adaptive fusion of multi-scale features. In addition, the invention designs a targeted doubly weighted focal loss function that addresses class imbalance while introducing a dynamic false positive penalty mechanism, effectively suppressing false detections. By processing bridge images acquired by an unmanned aerial vehicle, the method achieves pixel-level accurate segmentation of fine cracks against complex backgrounds and provides reliable technical support for bridge structural health monitoring.
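
The SE channel attention named in the abstract follows the squeeze-and-excitation pattern: pool each channel to one number, pass the pooled vector through two fully connected layers, and rescale each channel by the resulting weight. The following is a minimal pure-Python sketch of that pattern, not the patented implementation. The function name `se_recalibrate`, the ReLU and sigmoid activations, the omission of biases and the nested-list tensor layout are assumptions made for illustration.

```python
import math


def se_recalibrate(feature_map, w1, w2):
    """Illustrative SE channel recalibration (hypothetical helper).

    feature_map: list of C channels, each an H x W list of lists of floats.
    w1: hidden x C weight matrix of the first fully connected layer.
    w2: C x hidden weight matrix of the second fully connected layer.
    Biases are omitted (assumption) to keep the sketch short.
    Returns (recalibrated feature map, per-channel weights).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    descriptors = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, layer 1: fully connected + ReLU (activation is an assumption).
    hidden = [
        max(0.0, sum(w * d for w, d in zip(row, descriptors)))
        for row in w1
    ]
    # Excitation, layer 2: fully connected + sigmoid, giving weights in (0, 1).
    weights = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Recalibration: multiply every pixel of a channel by that channel's weight.
    recalibrated = [
        [[weight * value for value in row] for row in channel]
        for channel, weight in zip(feature_map, weights)
    ]
    return recalibrated, weights


# Two 2x2 channels and a single hidden unit, chosen only for illustration.
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
out, scale = se_recalibrate(fmap, w1=[[1.0, 0.0]], w2=[[0.0], [1.0]])
```

In a real network the two weight matrices would be learned during training and the operation would run on batched tensors in a deep learning framework.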

Inventors

NING BINQUAN
SUN DONGFENG
ZHUANG CHENG
XING ZHENZHEN
CHEN HAO
SI WEIYUAN
JIANG NAN
MA LIANG
NIE XUE

Assignees

吉林省高速公路集团试验检测有限公司

Dates

Publication Date: 20260505
Application Date: 20260409

Claims (8)

1. A bridge crack image segmentation system, comprising: an image acquisition module for acquiring an image to be segmented of a bridge surface; a data preprocessing module for preprocessing the image to be segmented or a training image; a segmentation network model, which is an SA-UNet model comprising a feature encoder, a feature decoder and a skip connection path connected between corresponding levels of the feature encoder and the feature decoder, wherein each layer of the feature encoder comprises an SE channel attention module for adaptively recalibrating an input feature map along the channel dimension, and an IIA interactive spatial attention module is arranged on the skip connection path for adaptively fusing encoder features and decoder features in the spatial dimension; a model training module for training the segmentation network model on an image data set containing bridge crack annotations, with a doubly weighted focal loss function as the optimization objective, to obtain trained model weights; and a segmentation execution module for loading the trained model weights, performing pixel-level segmentation, through the segmentation network model, on the image to be segmented after it has been processed by the data preprocessing module, and outputting a crack segmentation mask.
2. The bridge crack image segmentation system according to claim 1, wherein the data preprocessing module comprises a data construction sub-module for constructing a training data set, which specifically screens the original images acquired by the image acquisition module, removes images of unqualified quality, annotates the crack areas of the screened images to generate semantic segmentation annotation files, and divides the annotated data set proportionally into a training set, a validation set and a test set.
3. The bridge crack image segmentation system according to claim 2, wherein the data preprocessing module further comprises a data enhancement sub-module for enhancing the training data set, the enhancement operation comprising at least one of: a brightness adjustment with an adjustment factor in the range [0.65, 1.35]; a rotation operation with a rotation angle in the range [-90°, 90°]; and a flipping operation comprising random flipping in the horizontal and vertical directions.
4. The bridge crack image segmentation system according to claim 1, wherein the feature encoder comprises a four-layer structure, each layer sequentially comprising two convolution layers, the SE channel attention module and one downsampling layer; the feature decoder comprises a four-layer structure, each layer sequentially comprising an upsampling layer and two convolution layers; and the skip connection path fuses the output feature map of each encoder layer, through the IIA interactive spatial attention module, before upsampling in the corresponding decoder layer.
5. The bridge crack image segmentation system according to claim 4, wherein the SE channel attention module comprises a compression unit, an excitation unit and a recalibration unit, wherein the compression unit performs global average pooling on an input feature map to generate channel statistics descriptors; the excitation unit comprises two fully connected layers and applies a nonlinear transformation to the channel statistics descriptors to learn inter-channel correlations and generate a weight coefficient for each channel; and the recalibration unit multiplies the weight coefficients with the original input feature map channel by channel to output an enhanced feature map.
6. The bridge crack image segmentation system according to claim 4, wherein the IIA interactive spatial attention module comprises a feature reconstruction unit, a spatial weight generation unit and a feature weighting and fusion unit, wherein the feature reconstruction unit reconstructs an input feature map along the height dimension and the width dimension respectively to obtain feature representations of two orthogonal views; the spatial weight generation unit generates a corresponding spatial attention weight map for each of the two orthogonal views through parallel pooling, convolution and activation operations; and the feature weighting and fusion unit multiplies each spatial attention weight map with the corresponding reconstructed features, restores the weighted features to the original shape, and adds them to the original input features to output a fused feature map.
7. The bridge crack image segmentation system according to claim 1, wherein the doubly weighted focal loss function $L$ is defined by the following formula:
$$L = -W_t \cdot (1 - P_t)^{\gamma} \cdot \log(P_t) \cdot F_P$$
wherein $W_t$ is a class weight for balancing the loss contributions of crack and background pixels, $(1 - P_t)^{\gamma}$ is the standard focal loss factor for reducing the weight of easily classified samples, and $F_P$ is a false positive penalty factor; the false positive penalty factor $F_P$ is defined as follows:
$$F_P = \begin{cases} 1 + (p_{crack})^{r_{fp}} \cdot \dfrac{W_{crack}}{W_{background}}, & \text{when the true label } t \text{ is background} \\ 1, & \text{when the true label } t \text{ is crack} \end{cases}$$
wherein $p_{crack}$ is the confidence with which the model predicts that the pixel is a crack, $r_{fp}$ is a penalty exponent greater than 0, and $W_{crack}$ and $W_{background}$ are the class weights of the crack and the background, respectively.
8. A bridge crack image segmentation method, characterized in that it is applied to the bridge crack image segmentation system according to any one of claims 1 to 7, the method comprising the steps of: acquiring bridge surface images, and constructing and preprocessing them to obtain an enhanced bridge crack semantic segmentation data set; building an SA-UNet segmentation network model, wherein an SE channel attention module is embedded in each layer of the feature encoder and an IIA interactive spatial attention module is introduced in the skip connection paths between the encoder and the decoder; supervising the training of the segmentation network model on the enhanced data set with a doubly weighted focal loss function to obtain optimal trained weights; and performing forward inference on a new bridge surface image using the optimal trained weights, thereby achieving pixel-level crack segmentation and outputting the result.
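
The doubly weighted loss of claim 7 can be sketched per pixel as follows. This is an illustrative reading of the formula, not the patented implementation. Claim 7 does not define P_t, so the sketch assumes the usual focal loss convention that P_t is the predicted probability of the true class. The function names, the default weights (0.9 and 0.1), gamma = 2.0, r_fp = 1.0 and the clamping constant `eps` are hypothetical values chosen for illustration.

```python
import math


def doubly_weighted_focal_loss(p_crack, is_crack, w_crack=0.9,
                               w_background=0.1, gamma=2.0, r_fp=1.0,
                               eps=1e-7):
    """Per-pixel loss L = -W_t * (1 - P_t)**gamma * log(P_t) * F_P.

    p_crack: predicted probability that the pixel is a crack.
    is_crack: True if the ground-truth label is crack, else background.
    All default hyperparameters are hypothetical, not taken from the patent.
    """
    # Clamp to avoid log(0); eps is an implementation assumption.
    p_crack = min(max(p_crack, eps), 1.0 - eps)
    if is_crack:
        p_t = p_crack            # assumed: probability of the true class
        w_t = w_crack
        f_p = 1.0                # no false positive penalty on crack pixels
    else:
        p_t = 1.0 - p_crack      # assumed: probability of the true class
        w_t = w_background
        # Penalty grows with the model's (wrong) confidence in "crack".
        f_p = 1.0 + (p_crack ** r_fp) * (w_crack / w_background)
    return -w_t * (1.0 - p_t) ** gamma * math.log(p_t) * f_p


def mean_loss(probabilities, labels, **kwargs):
    """Average the per-pixel loss over a flat list of pixels."""
    losses = [doubly_weighted_focal_loss(p, t, **kwargs)
              for p, t in zip(probabilities, labels)]
    return sum(losses) / len(losses)


# A background pixel confidently predicted as crack is penalized far more
# heavily than a background pixel correctly predicted as background.
false_positive = doubly_weighted_focal_loss(0.9, is_crack=False)
true_negative = doubly_weighted_focal_loss(0.01, is_crack=False)
```

With these illustrative settings, the penalty factor for a background pixel predicted as crack with confidence 0.9 is 1 + 0.9 × 9 = 9.1, so its loss is 9.1 times the plain class-weighted focal term. For crack pixels the factor is 1 and the loss reduces to the class-weighted focal loss.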

Description

Bridge crack image segmentation system and method thereof

Technical Field

The invention relates to the technical field of computer vision and artificial intelligence, and in particular to an image segmentation system and method that are especially suitable for accurately segmenting crack targets from bridge surface images with complex backgrounds.

Background

Bridges are key transportation infrastructure, and their structural health is directly related to public safety. Cracks are one of the most common forms of bridge defect, so detecting them promptly and accurately is essential. Traditional manual inspection is inefficient, highly subjective and carries safety risks. With the development of technology, image-based automatic detection has become a research hotspot.

Early automatic detection methods relied mainly on traditional image processing techniques such as thresholding and edge detection. These methods generally require manually set parameters, are sensitive to changes in ambient illumination and background texture, and have poor robustness; in particular, they segment fine cracks with low contrast and complex morphology poorly.

In recent years, semantic segmentation based on deep learning, in particular U-Net and its variants, has been remarkably successful in medical image and remote sensing image segmentation and has been introduced into the field of bridge crack detection. Such methods learn features automatically and improve detection accuracy to a certain extent. For example, Chinese patent document CN115187621A (published October 14, 2022) discloses a U-Net network incorporating an attention mechanism for the automatic extraction of medical image contours, and US patent document US-20220309674-A1 (published September 29, 2022) discloses a medical image segmentation method based on U-Net.

However, bridge crack detection faces unique challenges: crack pixels make up an extremely small proportion of the whole image (extreme class imbalance); cracks are fine and irregular in shape and are easily confused with background stains and textures; and image acquisition is strongly affected by illumination and viewing angle. When a general-purpose segmentation network is applied directly to this scenario, it often suffers from a high miss rate for fine cracks and frequent over-detection (false positives) in complex backgrounds, making it difficult to meet engineering accuracy requirements.

In the prior art, although some research has attempted to introduce attention mechanisms into U-Net to improve feature selectivity, most of it focuses on attention in a single dimension (such as channel attention or spatial attention) and does not make full, coordinated use of feature information from different dimensions. For example, Chinese patent publication No. CN120276031B (dated April 17, 2025) discloses a seismic data interpolation method based on a CBAM-Res2Unet network, which introduces a CBAM module in the skip connections to realize channel and spatial attention, and Zilong Huang et al., "CCNet: Criss-Cross Attention for Semantic Segmentation", IEEE Transactions on Pattern Analysis and Machine Intelligence (Vol. 45, No. 6, pp. 6896-6908, June 2023), proposes a criss-cross attention module for semantic segmentation. None of these methods specifically optimizes the loss function design for the key difficulty of suppressing the misclassification of background as cracks. Therefore, a dedicated segmentation scheme that deeply fuses multi-dimensional attention and has a targeted optimization objective is needed to achieve high-precision, highly robust segmentation of bridge cracks.

Disclosure of Invention

To address the core pain points of existing bridge crack segmentation technology, namely a high miss rate for fine cracks, frequent false positive detections and insufficient coordinated use of multi-dimensional features, the invention provides a high-precision image segmentation scheme suited to bridge structural health monitoring. It achieves pixel-level accurate segmentation of bridge cracks against complex backgrounds through the coordinated optimization of a dual-dimension (channel and spatial) attention mechanism and a targeted loss function design.

To achieve the above purpose, the core technical concept adopted by the invention is as follows:

At the network architecture level, an SA-UNet segmentation model improved from U-Net is constructed, forming a collaborative feature extraction chain of "layer-by-layer channel screening followed by adaptive spatial alignment". On the one hand, an SE channel attention module (hereinafter referred to as the SE module) is embedded in each level of the feature encoder, so that key channel features related to cracks are reinforced layer by layer during progressive feature extraction, invalid information transmission of b