CN-121982003-A - Abnormality detection system and method based on distillation training


Abstract

The application provides an anomaly detection system and method based on distillation training, belonging to the field of artificial intelligence and industrial quality inspection. The anomaly detection system comprises an anomaly detection large model and a real-time anomaly detection model. The anomaly detection large model serves as a teacher model and the real-time anomaly detection model serves as a student model, and the anomaly detection knowledge in the anomaly detection large model is transferred into the real-time anomaly detection model through channel-by-channel knowledge distillation training, so that detection efficiency is improved while anomaly detection accuracy is maintained. The application achieves zero-shot detection capability for flaw detection items, can greatly mitigate the impact on industrial quality inspection solutions of problems such as the long collection period for flaw samples of high-precision workpieces and the insufficient robustness of long-tail sample detection, and is convenient to popularize and apply in the field of industrial quality inspection.

Inventors

QIAN ZHIMING
GAO YANG
YANG PENG
WANG ZHIWEI

Assignees

苏州天准科技股份有限公司
苏州天准软件有限公司

Dates

Publication Date: 20260505
Application Date: 20260128

Claims (10)

1. An anomaly detection system based on distillation training, comprising an anomaly detection large model and a real-time anomaly detection model, wherein: the anomaly detection large model comprises a text encoder, an image encoder sharing weights, an image decoder, a vector acquisition module, an aggregation module and an inference judgment module; the real-time anomaly detection model comprises a small model backbone network and a plurality of sampling modules connected in sequence; and the anomaly detection large model is used as a teacher model, the real-time anomaly detection model is used as a student model, and the anomaly detection knowledge in the anomaly detection large model is migrated into the real-time anomaly detection model through channel-by-channel knowledge distillation training.
2. The anomaly detection system of claim 1, wherein in the anomaly detection large model, the text encoder receives a text description of the defect, the image encoder sharing weights receives an anomalous image and a good image, and the image decoder is a lightweight decoder that outputs an anomaly map result.
3. The anomaly detection system of claim 1, wherein the vector acquisition module obtains four types of vector features, wherein a feature vector V1 is obtained by adding the encoded features of the anomalous image and the good image and inputting the sum to a linear layer, a learnable feature vector V2 is provided by a prompt learning layer, a feature vector V3 is obtained by passing the output features of the image decoder through convolution, pooling and fully connected layers, and a feature vector V4 is obtained from the image encoder through convolution, pooling and fully connected layers.
4. The anomaly detection system of claim 3, wherein the aggregation module is configured to aggregate the feature vector V1, the feature vector V2, the feature vector V3, and the feature vector V4.
5. The anomaly detection system of claim 4, wherein the inference judgment module performs inference judgment on the aggregated feature vector data based on a LLaVA-Llama model to obtain a large model training or inference result.
6. The anomaly detection system of claim 4, wherein the training loss function of the anomaly detection large model includes a pixel-level error loss function for anomaly detection and a prediction loss function for LLaVA large model training.
7. The anomaly detection system of claim 1, wherein the real-time anomaly detection model uses a YOLOv small model backbone network and the sampling modules are three lightweight up-sampling modules.
8. The anomaly detection system of claim 7, wherein the lightweight upsampling module comprises a convolution layer, a ReLU activation layer, and a MaxPooling layer.
9. An anomaly detection method based on distillation training, characterized by comprising the following steps: S1, training an anomaly detection large model by adopting a strategy based on prompt learning fine-tuning; S2, distillation training, namely taking the anomaly detection large model as a teacher model, taking the real-time anomaly detection model as a student model, and carrying out model training by adopting a training method of channel-by-channel knowledge distillation; S3, independently using the real-time anomaly detection model after distillation training for model inference; S4, carrying out anomaly detection on the product to be detected based on the real-time anomaly detection model.
10. The anomaly detection method of claim 9, wherein the distillation training uses a training method based on channel-by-channel knowledge distillation to align the second-level and third-level features of the real-time anomaly detection model.
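
As an illustration of the channel-by-channel knowledge distillation recited in claims 9 and 10, the following is a minimal NumPy sketch and not the applicant's implementation. The claims do not specify the form of the loss, so the sketch assumes the commonly used channel-wise formulation: each channel's activations are turned into a probability distribution by a softmax over spatial positions, and the student is pulled toward the teacher by a KL divergence. The function names, the temperature `tau`, the array shapes, and the assumption that teacher and student feature maps already have matching shapes (for example after a projection layer) are all hypothetical.

```python
import numpy as np

def channel_softmax(feat, tau):
    """Turn each channel's H*W activations into a spatial probability distribution."""
    channels = feat.shape[0]
    logits = feat.reshape(channels, -1) / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def channel_wise_kd_loss(teacher_feat, student_feat, tau=1.0, eps=1e-12):
    """KL(teacher || student) per channel, averaged over channels, scaled by tau^2.

    Both inputs are (C, H, W) arrays with identical shapes (assumption).
    """
    if teacher_feat.shape != student_feat.shape:
        raise ValueError("teacher and student features must have the same shape")
    p_teacher = channel_softmax(teacher_feat, tau)
    p_student = channel_softmax(student_feat, tau)
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=1)
    return float(tau ** 2 * kl.mean())

def multi_level_kd_loss(feature_pairs, tau=1.0):
    """Sum the channel-wise loss over several (teacher, student) feature levels."""
    return sum(channel_wise_kd_loss(t, s, tau) for t, s in feature_pairs)

# Hypothetical second-level and third-level feature maps.
rng = np.random.default_rng(0)
teacher_l2, student_l2 = rng.normal(size=(8, 16, 16)), rng.normal(size=(8, 16, 16))
teacher_l3, student_l3 = rng.normal(size=(16, 8, 8)), rng.normal(size=(16, 8, 8))

loss_identical = channel_wise_kd_loss(teacher_l2, teacher_l2)
loss_total = multi_level_kd_loss([(teacher_l2, student_l2), (teacher_l3, student_l3)])
print(loss_identical, loss_total)
```

In this sketch the loss is zero when the student reproduces the teacher's features and positive otherwise; in a real training loop only the student's parameters would be updated while the teacher stays frozen.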

Description

Abnormality detection system and method based on distillation training

Technical Field

The invention belongs to the field of artificial intelligence and industrial quality inspection, and particularly relates to an anomaly detection system and method based on distillation training.

Background

In the field of intelligent manufacturing, industrial quality inspection is an important link in the production process. Supervised-learning schemes for segmentation, classification and target detection are the mainstream of current intelligent industrial quality inspection. However, the defect rate in an actual production scene is lower than five per thousand, so defects are difficult to collect and the data volume is seriously insufficient, which causes long project periods, unsatisfactory detection accuracy and excessive cost. For the inspection of high-precision workpieces, the defect types do not need to be subdivided, yet problems such as the long collection period of defect samples and the insufficient detection robustness for long-tail samples (abnormal samples with low occurrence frequency) still exist.

Disclosure of Invention

To overcome the defects in the prior art, the invention aims to provide an anomaly detection system and method based on distillation training that can solve the above problems.
The design principle is that anomaly detection makes full use of the similarity of good images, converting the open-set problem of target detection and recognition into a closed-set problem of similarity clustering of standard products. Because anomaly detection needs no flaw samples or labeling information and can effectively detect unknown defects, it has become an emerging technical direction in industrial quality inspection; mainstream anomaly detection methods include FastFlow and EfficientAD.
Therefore, the applicant has designed a novel and efficient anomaly detection method and applied it to the field of industrial quality inspection to solve the above problems. The overall design is as follows.
The anomaly detection system based on distillation training comprises an anomaly detection large model and a real-time anomaly detection model. The anomaly detection large model comprises a text encoder, an image encoder sharing weights, an image decoder, a vector acquisition module, an aggregation module and an inference judgment module. The real-time anomaly detection model comprises a small model backbone network and a plurality of sampling modules connected in sequence. The anomaly detection large model is used as a teacher model, the real-time anomaly detection model is used as a student model, and the anomaly detection knowledge in the anomaly detection large model is migrated into the real-time anomaly detection model through channel-by-channel knowledge distillation training, so that detection efficiency is improved while anomaly detection accuracy is maintained.
Further, in the anomaly detection large model, the text encoder receives a text description of the defect, the image encoder sharing weights receives an anomalous image and a good image, and the image decoder is a lightweight decoder that outputs an anomaly map result.
Furthermore, the vector acquisition module obtains four types of vector features: the feature vector V1 is obtained by adding the encoded features of the anomalous image and the good image and inputting the sum into a linear layer; a learnable feature vector V2 is provided by a prompt learning layer; the feature vector V3 is obtained by passing the output features of the image decoder through convolution, pooling and fully connected layers; and the feature vector V4 is obtained from the image encoder through convolution, pooling and fully connected layers.
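
The four feature vectors described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: every dimension, the random weights, the choice of a 1x1 convolution and the choice of global average pooling are hypothetical, because the source does not specify kernel sizes, pooling types or feature sizes. The same convolution and fully connected weights are reused for V3 and V4 only to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)
D_ENC, D_OUT = 32, 16             # hypothetical encoder and output vector sizes
C_IN, C_MID, H, W = 8, 4, 6, 6    # hypothetical feature-map shape

def linear(x, weight, bias):
    """Fully connected layer: weight @ x + bias."""
    return weight @ x + bias

def vector_v1(enc_anomalous, enc_good, weight, bias):
    """V1: add the two image encodings, then apply one linear layer."""
    return linear(enc_anomalous + enc_good, weight, bias)

def conv_pool_fc(feature_map, conv_w, fc_w, fc_b):
    """V3/V4 path: convolution, pooling, then a fully connected layer."""
    # A 1x1 convolution is per-pixel channel mixing (kernel size is an assumption).
    mixed = np.einsum("oc,chw->ohw", conv_w, feature_map)
    pooled = mixed.mean(axis=(1, 2))  # global average pooling (assumption)
    return linear(pooled, fc_w, fc_b)

# Random stand-ins for the encodings and learned parameters.
w1, b1 = rng.normal(size=(D_OUT, D_ENC)), np.zeros(D_OUT)
enc_anomalous, enc_good = rng.normal(size=D_ENC), rng.normal(size=D_ENC)
conv_w = rng.normal(size=(C_MID, C_IN))
fc_w, fc_b = rng.normal(size=(D_OUT, C_MID)), np.zeros(D_OUT)

v1 = vector_v1(enc_anomalous, enc_good, w1, b1)
v2 = rng.normal(size=D_OUT)  # stand-in for the learnable prompt vector
v3 = conv_pool_fc(rng.normal(size=(C_IN, H, W)), conv_w, fc_w, fc_b)  # decoder features
v4 = conv_pool_fc(rng.normal(size=(C_IN, H, W)), conv_w, fc_w, fc_b)  # encoder features
print(v1.shape, v2.shape, v3.shape, v4.shape)
```

Because V1 is computed from the sum of the two encodings, it does not depend on which of the two images is passed first; giving all four vectors the same length is a convenience of this sketch rather than a requirement stated in the source.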
Further, the aggregation module is configured to aggregate the feature vector V1, the feature vector V2, the feature vector V3, and the feature vector V4. Further, the inference judgment module performs inference judgment on the aggregated feature vector data based on a LLaVA-Llama model to obtain a large model training or inference result. Further, the training loss function of the anomaly detection large model includes a pixel-level error loss function for anomaly detection and a prediction loss function for LLaVA large model training. Furthermore, the real-time anomaly detection model adopts a YOLOv small model backbone network, and the sampling modules are three lightweight up-sampling modules. Further, the lightweight upsampling module includes a convolution layer, a ReLU activation layer, and a MaxPooling layer.
The application further provides an anomaly detection method based on distillation training, which comprises the steps of S1, training an anomaly detection large model by adopting a strategy based on prompt learning fi