CN-115294038-B - Defect detection method based on joint optimization and mixed attention feature fusion
Abstract
The invention relates to a two-stage defect detection method based on segmentation and classification. To address the insufficient feature extraction capability of two-stage defect detection algorithms, it provides a mixed attention feature fusion module that is integrated into a segmentation network with an encoder-decoder structure, so that the model makes better use of global context information and uses the extracted deep features to reconstruct a pixel-level segmentation map. In addition, the invention provides a multi-receptive-field spatial attention module, which extracts spatial attention weights using the enlarged receptive field provided by dilated (atrous) convolution, effectively enhancing the model's ability to extract tiny features. To address the low training efficiency of two-stage defect detection models, the invention provides a joint optimization framework in which the model is trained end to end using a constructed joint loss function. Experiments show that the proposed improvements effectively improve detection accuracy on defect detection tasks.
Inventors
- DONG YONGFENG
- SUN SONGYI
- WANG ZHEN
- QI QIAOLING
- WANG LIQIN
Assignees
- Hebei University of Technology (河北工业大学)
Dates
- Publication Date
- 20260421
- Application Date
- 20220725
- Priority Date
- 20220725
Claims (3)
- 1. A defect detection method based on joint optimization and mixed attention feature fusion, characterized by comprising the following steps: first, collecting a surface image of a workpiece to be detected, preprocessing the collected image, setting ground-truth labels for training, and constructing a network model, the model being divided into a segmentation network and a classification network, wherein the segmentation network consists of an encoder, a decoder and a mixed attention feature fusion module, and the classification network consists of a convolutional backbone, a multi-receptive-field spatial attention module and a classifier; second, inputting the preprocessed image into the model for training, constructing a joint loss function, and setting optimization parameters and the number of iterations, the output of the model being a pixel-level segmentation map of the defective region and the corresponding defect type; third, saving the trained model weights and using the model to detect surface defects of the workpiece; the defect detection model based on mixed attention feature fusion is formed by a segmentation network with an encoder-decoder structure and a classification network combined with a multi-receptive-field spatial attention module; the segmentation network consists of an encoder-decoder backbone and the mixed attention feature fusion module, wherein the encoder comprises four consecutive downsampling operations with a stride of 2; the feature map extracted by each encoder layer is input to the mixed attention feature fusion module, concatenated with the feature map of the same resolution reconstructed by the decoder, and passed to the subsequent convolutions; the decoder finally outputs a pixel-level segmentation map of the same size as the input image, indicating the position and shape of the defect; the deep features output by the encoder of the segmentation network serve as the input of the classification network, where they are weighted by the multi-receptive-field spatial attention module and then convolved; regarding the multi-receptive-field spatial attention module and the mixed attention feature fusion module: the mixed attention feature fusion module combines spatial attention and channel attention to fuse the shallow and deep features of the encoder-decoder network; the feature map input from the encoder and the feature map from the lower decoder layer, upsampled by bilinear interpolation, are added element-wise and input to the channel attention module to obtain channel-wise attention weights; the feature maps produced by multiplying these weights with each of the two feature maps respectively serve as the inputs of two multi-receptive-field spatial attention modules; the results weighted by spatial attention are concatenated to form the output feature map of the whole mixed attention feature fusion module, which then takes part in the computation of the higher network layers; the mixed attention feature fusion module consists of multi-receptive-field spatial attention modules and a channel attention module, wherein, in the multi-receptive-field spatial attention module, the input feature map X is convolved under different receptive fields, each result is activated by the nonlinear activation function ReLU, and the results are concatenated; a convolution then compresses the number of channels of the concatenated feature map to 1, and, after activation by the Sigmoid function, the result is multiplied with the original input feature map to obtain the attention-weighted feature map; the channel attention module uses a global max pooling operation and a convolution operation to extract global channel attention and local channel attention respectively, converts the extracted parameters into feature weights through a Sigmoid function, and multiplies the feature weights with the input feature map to obtain the channel-weighted feature map.
- 2. The defect detection method based on joint optimization and mixed attention feature fusion according to claim 1, wherein, in the formula for the multi-receptive-field spatial attention module, the operators denote respectively the concatenation operation, convolution operations whose kernel sizes correspond to the different receptive fields, and the ReLU activation function; and wherein, in the formula for the channel attention module, the pooling operator denotes the global max pooling operation.
- 3. The defect detection method based on joint optimization and mixed attention feature fusion according to claim 1, wherein the third step is mainly characterized by a joint-optimization-based method for optimizing the two-stage segmentation-classification defect detection model, the core of which is a constructed joint loss function in which: two balance coefficients balance the segmentation loss and the classification loss; a weight factor controlled by the number of iteration rounds takes the form of the ratio of the current iteration round to the total number of rounds; a modulation factor reflects how easy a sample is to segment; the classification term is defined over the classification categories, using the classification score corresponding to the true category and the classification score corresponding to each category c; and ε is a regularization coefficient that constrains the dispersion of the classification scores of the non-target categories.
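The round-dependent weighting described in claim 3 can be illustrated with a minimal Python sketch. The exact joint loss formula is not legible in this text, so the combination used here is only an assumption: the segmentation loss is weighted by (1 - delta) and the classification loss by delta, where delta is the ratio of the current round to the total number of rounds. The names `schedule_weight`, `joint_loss`, `lambda_seg` and `lambda_cls` are hypothetical, and the modulation factor and the ε regularization term are omitted.

```python
def schedule_weight(current_round, total_rounds):
    """Weight factor controlled by the iteration round: current / total, clamped to [0, 1]."""
    if total_rounds <= 0:
        raise ValueError("total_rounds must be positive")
    return min(max(current_round / total_rounds, 0.0), 1.0)


def joint_loss(seg_loss, cls_loss, current_round, total_rounds,
               lambda_seg=1.0, lambda_cls=1.0):
    """Combine the two branch losses into one scalar for end-to-end training.

    ASSUMPTION: the (1 - delta) / delta split is illustrative only; the
    claimed formula may combine the terms differently.
    """
    delta = schedule_weight(current_round, total_rounds)
    # Early rounds emphasise segmentation, later rounds emphasise classification.
    return lambda_seg * (1.0 - delta) * seg_loss + lambda_cls * delta * cls_loss


# Example with made-up loss values at the start, middle and end of training.
for r in (0, 5, 10):
    print(r, joint_loss(seg_loss=2.0, cls_loss=4.0, current_round=r, total_rounds=10))
```

Under this assumed form, the classification branch contributes little while the segmentation features are still unstable, and both branches are still updated in a single training run.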
Description
Defect detection method based on joint optimization and mixed attention feature fusion
Technical Field
The technical scheme of the invention relates to the fields of deep learning, convolutional neural networks and defect detection, and in particular to a defect detection method based on joint optimization and mixed attention feature fusion.
Background
Defect detection is an indispensable means of quality control in modern industrial production. Traditional machine vision inspection methods exploit the differing properties of workpiece surface defects to design a suitable imaging scheme, then process hand-crafted features with machine-learning-based image processing algorithms to extract the defect information that the workpiece surface may contain; such methods have been widely studied and applied in industrial production. As deep learning research has advanced, deep neural network models represented by convolutional neural networks (CNNs) have been widely applied to defect detection and perform well on defect feature extraction and defect classification. In its simplest form, defect detection is a classification problem: deciding whether the image under inspection contains a defect. Building on this, requirements for information such as defect shape, type and position extend the problem to research areas such as defect localization and defect segmentation.
Existing classification-based convolutional neural network methods for surface defect detection fall broadly into "one-stage" methods, which classify the original image directly, and "two-stage" methods, which are combined with segmentation and localization tasks. Both have shortcomings: (1) One-stage methods that classify the original image directly use simple, shallow networks; when faced with defect types of complex shape, the feature extraction capability of the model falls short, so classification accuracy is unsatisfactory. Adding an attention mechanism and a feature fusion method can effectively improve the feature extraction capability of a convolutional neural network and focus the network on the defective region; the invention therefore adds an attention-based feature fusion method to the convolutional neural network. (2) Two-stage classification models usually adopt non-end-to-end training: the segmentation branch is trained first and its network parameters are saved, then the model weights with the better segmentation results are loaded into the network and the classification branch is trained. This allows both stages to be trained well, but training is time-inefficient and consumes substantial computing resources. The joint optimization method provided by the invention trains the model end to end and improves its classification accuracy.
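The attention-based feature weighting mentioned in item (1) is specified in claim 1 as a multi-receptive-field spatial attention module: dilated convolutions under different receptive fields, ReLU, concatenation, compression to one channel, Sigmoid, and multiplication with the input. A minimal pure-Python sketch of that data flow follows. Several details are assumptions made for brevity and are not taken from the patent: each branch produces a single output channel, the branches use 3×3 kernels with dilation rates 1, 2 and 3, the compressing convolution is pointwise, and the weights are random rather than learned. The function names are hypothetical.

```python
import math
import random


def dilated_conv3x3(x, kernel, dilation):
    """Zero-padded 3x3 dilated convolution. x: C x H x W, kernel: C x 3 x 3 -> H x W."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for ch in range(c):
                for ki in range(3):
                    for kj in range(3):
                        # Dilation spreads the taps apart, enlarging the receptive field.
                        ii, jj = i + (ki - 1) * dilation, j + (kj - 1) * dilation
                        if 0 <= ii < h and 0 <= jj < w:
                            out[i][j] += x[ch][ii][jj] * kernel[ch][ki][kj]
    return out


def spatial_attention(x, kernels, dilations, fuse_weights):
    """Return (attention-weighted feature map, H x W attention map)."""
    h, w = len(x[0]), len(x[0][0])
    # One branch per receptive field: dilated convolution followed by ReLU.
    branches = []
    for kernel, d in zip(kernels, dilations):
        fmap = dilated_conv3x3(x, kernel, d)
        branches.append([[max(0.0, v) for v in row] for row in fmap])
    # The list of branch maps plays the role of the concatenated channels;
    # an assumed pointwise convolution compresses them to one channel, then Sigmoid.
    attn = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            z = sum(fw * b[i][j] for fw, b in zip(fuse_weights, branches))
            attn[i][j] = 1.0 / (1.0 + math.exp(-z))
    # Every input channel is scaled by the same spatial attention map.
    weighted = [[[x[ch][i][j] * attn[i][j] for j in range(w)] for i in range(h)]
                for ch in range(len(x))]
    return weighted, attn


random.seed(0)
C, H, W = 2, 6, 6
dilations = (1, 2, 3)  # assumed dilation rates
x = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
kernels = [[[[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
            for _ in range(C)] for _ in dilations]
fuse_weights = [random.uniform(-1, 1) for _ in dilations]
weighted, attn = spatial_attention(x, kernels, dilations, fuse_weights)
print(len(weighted), len(weighted[0]), len(weighted[0][0]))
```

The output keeps the shape of the input, so the module can be placed in front of an existing convolution without changing the layers around it.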
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a defect detection method based on joint optimization and mixed attention feature fusion. It provides a spatial attention mechanism based on multiple receptive fields, which effectively enhances the model's ability to extract features of tiny defects, and combines it with a channel attention mechanism to construct a mixed attention feature fusion module. This module replaces the long skip connections that traditional autoencoder models use to fuse features at different scales, thereby making better use of global context information. The method is trained end to end under a joint optimization framework, which significantly improves the accuracy of the defect detection task.
The technical scheme adopted to solve the technical problem is a defect detection method based on joint optimization and mixed attention feature fusion, comprising the following steps:
First, collect a surface image of the workpiece to be detected, preprocess the collected image, set ground-truth labels for training, and construct the network model, which is divided into two parts: a segmentation network and a classification network.
Second, input the preprocessed image into the model for training and set the optimization parameters and the number of iterations; the output of the model is a pixel-level segmentation map of the defect information and the corresponding defect type.
Third, save the trained model weights and use the model to detect surface defects of the workpiece.
In the first step, the acquired image is normalized to an image size of 512×512 pixels through a downsampling operation, and the original RGB three-channel i