CN-122023361-A - Self-supervised industrial anomaly detection method based on single-step denoising diffusion model


Abstract

The invention relates to a self-supervised industrial anomaly detection method based on a single-step denoising diffusion model. The method comprises: acquiring industrial product surface images, preprocessing them, and dividing them into a training set and a test set; performing multi-scale anomaly synthesis on the training set to generate simulated defect samples; performing self-supervised training of a two-stage single-step denoising reconstruction sub-network on the simulated defect samples to obtain a trained sub-network; inputting the industrial product surface image to be detected into the trained sub-network to obtain a defect-free reconstructed image; fusing features of the test-set samples with the defect-free reconstructed image and generating a score map with a defect segmentation sub-network; and performing Top-K aggregation and threshold judgment on the score map to output an anomaly judgment result and a defect localization map. The invention achieves high-precision localization of minute defects and high-fidelity reconstruction of normal textures while significantly improving detection speed.

Inventors

WAN JIHONG
HONG JIANJIE
LI MIN
WANG BAIHAN
LI XIAOPING

Assignees

Guangdong University of Technology (广东工业大学)

Dates

Publication Date: 20260512
Application Date: 20260203

Claims (8)

1. A self-supervised industrial anomaly detection method based on a single-step denoising diffusion model, characterized by comprising the following steps: Step 1, acquiring industrial product surface images, preprocessing them, and dividing them into a training set and a test set, wherein the training set consists of defect-free normal samples and the test set consists of defect-free normal samples and samples with various real defects; Step 2, performing multi-scale anomaly synthesis on the training set to generate simulated defect samples; Step 3, performing self-supervised training of a two-stage single-step denoising reconstruction sub-network on the simulated defect samples to obtain the trained two-stage single-step denoising reconstruction sub-network, and inputting the surface image of the industrial product to be detected into the trained sub-network to obtain a defect-free reconstructed image, wherein the two-stage single-step denoising reconstruction sub-network is built on a diffusion model and removes anomalous interference from the image by progressive single-step denoising in latent space; Step 4, fusing features of the samples in the test set with the defect-free reconstructed image, extracting nonlinear residual features with a defect segmentation sub-network, and generating a score map; Step 5, applying a Top-K aggregation strategy and threshold judgment to the score map, and outputting an anomaly judgment result and a defect localization map.
2. The self-supervised industrial anomaly detection method based on a single-step denoising diffusion model of claim 1, wherein performing multi-scale anomaly synthesis on the training set to generate simulated defect samples comprises: obtaining a structural morphology mask using procedural noise and nonlinear geometric transformation; and, for the samples in the training set, embedding anomalous texture features into the region defined by the structural morphology mask through a dynamic weighted fusion algorithm to generate the simulated defect samples.
3. The self-supervised industrial anomaly detection method based on a single-step denoising diffusion model of claim 1, wherein the simulated defect sample S is generated as: [formula not preserved in the source text]; wherein I is the current normal input image, M is the composite mask, $\odot$ denotes the Hadamard product, $\beta \in [0.1, 1.0]$ is a dynamically adjusted fusion coefficient, and $\bar{M}$ is the pixel-wise inverse of mask M.
4. The self-supervised industrial anomaly detection method based on a single-step denoising diffusion model according to claim 1, wherein performing self-supervised training of the two-stage single-step denoising reconstruction sub-network on the simulated defect samples comprises: performing preliminary denoising of the simulated defect sample at a low noise level to obtain preliminarily repaired features; at a high noise level, using the preliminarily repaired features as conditional guidance to obtain reconstructed latent-space features; and inputting the reconstructed latent-space features into a pre-trained latent diffusion model decoder to obtain a reconstructed image.
5. The self-supervised industrial anomaly detection method based on a single-step denoising diffusion model of claim 4, wherein the preliminary denoising of the simulated defect sample at the low noise level comprises: compressing the simulated defect sample with a pre-trained variational autoencoder to obtain a compressed feature representation; applying forward noising to the compressed feature representation at the low noise level to obtain low-noise features; and predicting the added noise component of the low-noise features with a U-Net built from residual convolution modules to obtain the preliminarily repaired features.
6. The self-supervised industrial anomaly detection method based on a single-step denoising diffusion model of claim 5, wherein using the preliminarily repaired features as conditional guidance to obtain the reconstructed latent-space features comprises: adding noise to the compressed feature representation at the high noise level to obtain high-noise features; and taking the preliminarily repaired features as semantic anchors, feeding them through a cross-attention layer or by channel concatenation into a U-Net with an embedded attention mechanism, and completing the high-noise features to obtain the reconstructed latent-space features.
7. The method of claim 1, wherein fusing features of the samples in the test set with the defect-free reconstructed image, extracting nonlinear residual features with a defect segmentation sub-network, and generating a score map comprises: concatenating the RGB channels of the test-set sample and the defect-free reconstructed image along the channel dimension to generate a joint vector; and inputting the joint vector into the defect segmentation sub-network, extracting nonlinear residual features with a multi-layer residual convolution network with skip connections, identifying anomalous regions, and outputting the score map through a Sigmoid activation function, wherein the defect segmentation sub-network is optimized with a composite loss function combining Focal Loss and Smooth L1 Loss, and the value at each coordinate of the score map is the confidence that the pixel deviates from the normal data distribution.
8. The self-supervised industrial anomaly detection method based on a single-step denoising diffusion model of claim 1, wherein applying the Top-K aggregation strategy and threshold judgment to the score map and outputting the anomaly judgment result and the defect localization map comprises: selecting, through the Top-K aggregation strategy, a group of local peaks with the strongest responses from the score map, computing their mean, and taking the mean as the anomaly score; and comparing the anomaly score with a preset decision threshold, outputting a final normal/abnormal decision, and converting the score map into a pseudo-color heat map to visualize the anomaly location.
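
As an illustration of the defect synthesis in claims 2 and 3, the following is a minimal sketch rather than the patented procedure. The formula of claim 3 is not preserved in the text, so the blend used here, S = (1 - M) * I + (1 - beta) * (M * I) + beta * (M * A), is an assumption that is merely consistent with the symbols the claim defines. The anomaly texture source `A` (`texture`), the helper names, and the thresholded coarse random noise used in place of true procedural noise are all hypothetical.

```python
import numpy as np

def make_mask(shape, rng, grid=8, threshold=0.6):
    """Binary mask from thresholded coarse random noise.

    A crude stand-in for the procedural noise named in claim 2.
    """
    h, w = shape
    coarse = rng.random((grid, grid))
    # Nearest-neighbour upsampling of the coarse grid to the image size.
    rows = np.arange(h) * grid // h
    cols = np.arange(w) * grid // w
    noise = coarse[np.ix_(rows, cols)]
    return (noise > threshold).astype(np.float32)

def synthesize_defect(image, texture, mask, beta):
    """Assumed blend: S = (1 - M) * I + (1 - beta) * (M * I) + beta * (M * A)."""
    m = mask[..., None]   # broadcast the mask over the channel axis
    inv_m = 1.0 - m       # pixel-wise inverse of the mask
    return inv_m * image + (1.0 - beta) * (m * image) + beta * (m * texture)
```

Under this assumed blend, pixels outside the mask are copied from the normal image unchanged, and a fusion coefficient drawn from [0.1, 1.0] controls how strongly the texture replaces the original content inside the mask.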
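
The two-stage single-step reconstruction of claims 4 to 6 can be sketched as follows. This is a hedged illustration only: it assumes the standard DDPM closed-form forward process and a linear noise schedule, neither of which the claims specify. The callables `predict_low` and `predict_high` are hypothetical placeholders for the two U-Nets, conditioning is shown by channel concatenation (one of the two options in claim 6), and the encoder and decoder are omitted.

```python
import numpy as np

def alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(z0, t, abar, rng):
    """Forward noising: z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(abar[t]) * z0 + np.sqrt(1.0 - abar[t]) * eps
    return zt, eps

def single_step_estimate(zt, eps_pred, t, abar):
    """One-step estimate of z0 from a predicted noise component."""
    return (zt - np.sqrt(1.0 - abar[t]) * eps_pred) / np.sqrt(abar[t])

def two_stage_reconstruct(z0, predict_low, predict_high, t_low, t_high, abar, rng):
    """Stage 1 at a low noise level, stage 2 at a high level conditioned on stage 1."""
    z_low, _ = add_noise(z0, t_low, abar, rng)
    prelim = single_step_estimate(z_low, predict_low(z_low), t_low, abar)
    z_high, _ = add_noise(z0, t_high, abar, rng)
    # Channels-first layout: stack the preliminary repair under the noisy latent.
    cond_input = np.concatenate([z_high, prelim], axis=0)
    eps_high = predict_high(cond_input)
    return single_step_estimate(z_high, eps_high, t_high, abar)
```

Each stage calls its predictor exactly once, which is where the speed advantage over iterative denoising would come from. If a predictor returned the exact injected noise, `single_step_estimate` would recover the clean latent regardless of the noise level.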
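
The image-level decision of claim 8 can be illustrated with a short sketch. It simplifies the claim by averaging the K largest pixel values of the score map instead of detecting local peaks first. The function names, the default `k`, and the default `threshold` are hypothetical values chosen for illustration.

```python
import numpy as np

def topk_anomaly_score(score_map, k=10):
    """Mean of the k largest values in the score map (k >= 1)."""
    flat = np.asarray(score_map, dtype=np.float64).ravel()
    k = min(k, flat.size)
    # np.partition places the k largest values in the last k positions.
    top = np.partition(flat, flat.size - k)[flat.size - k:]
    return float(top.mean())

def decide(score_map, threshold=0.5, k=10):
    """Return the anomaly score and whether it exceeds the decision threshold."""
    score = topk_anomaly_score(score_map, k)
    return score, score > threshold
```

Averaging only the strongest responses keeps a small defect from being diluted by the large normal area, which a mean over the whole map would do, while being less sensitive to a single noisy pixel than taking the maximum.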

Description

Self-supervised industrial anomaly detection method based on single-step denoising diffusion model

Technical Field

The invention relates to the technical field of industrial anomaly detection, and in particular to a self-supervised industrial anomaly detection method based on a single-step denoising diffusion model.

Background

Surface defect detection for industrial products is a key link in modern intelligent manufacturing and is important for ensuring product quality and reducing production cost. Because abnormal samples in industrial production are extremely scarce, diverse, and unpredictable, modeling the distribution of normal samples with an unsupervised learning strategy, and then identifying and localizing abnormal regions, has become a research focus in both academia and industry.

Existing unsupervised anomaly detection techniques fall mainly into feature-embedding methods and image-reconstruction methods. Image-reconstruction methods (for example, those based on an autoencoder (AE) or a generative adversarial network (GAN)) assume that the model can only reconstruct the normal distribution, and detect anomalies by comparing the residual between the original input image and the reconstructed image. In practice, however, such methods have two significant drawbacks. First, the generative capacity of the reconstruction network is limited: traditional architectures often lose high-frequency detail in normal regions during reconstruction (for example, fine textures are blurred), which produces large false-detection residuals in normal regions. Second, the reconstruction process suffers from over-generalization.
The reconstruction model tends to unintentionally repair defective regions so that they look normal, resulting in a high miss rate.

In recent years, diffusion models have been introduced into industrial image reconstruction tasks because of their excellent distribution fitting and high-quality image synthesis capabilities. By adding Gaussian noise to an image over multiple steps and learning the reverse denoising process, a diffusion model can in theory generate reconstructed images with higher fidelity than traditional architectures. However, the following core conflicts remain to be resolved before it can become a mature industrial solution.

First, the inference mechanism of diffusion models conflicts with real-time requirements. Diffusion models typically rely on many iterations of a Markov chain to perform reverse denoising (for example, DDPM or AnoDDPM), often requiring hundreds or even thousands of forward passes to reconstruct a single image. This very low inference efficiency incurs a large computational overhead and cannot meet the real-time online detection standards of high-speed production lines, which typically require millisecond-scale responses.

Second, there is a trade-off between anomaly erasure strength and detail-preservation fidelity. In diffusion-based reconstruction, the number of injected noise steps is a key parameter affecting performance.
If few noise steps are injected, the model preserves the global structure of normal regions but cannot fully destroy the semantic information of pronounced defects, so defect features remain in the reconstructed image. If many noise steps are injected, the model can thoroughly erase defects, but because the constraint of the original image is lost, the reconstructed normal regions deviate structurally from the original image (for example, shape deviation and texture misalignment), which causes serious false detections. Existing diffusion reconstruction schemes lack a cooperative mechanism that both thoroughly erases anomalous semantics and restores the details of normal regions with high fidelity.

Finally, the synthesized samples used in self-supervised training strategies suffer from limited realism and domain shift. Existing self-supervised anomaly detection algorithms typically construct pseudo-defects during training using simple noise smearing or geometric deformation. However, such simple synthesis has an obvious "domain gap" from real industrial defects (such as scratches, corrosion, and missing material) in morphology distribution and texture characteristics, so the model's segmentation accuracy and robustness are poor when it handles complex, weak defects in a real production environment.

In summary, how to construct an unsupervised anomaly detection scheme which not only can give consideration to high fidelity