CN-121982013-A - Shaving board surface defect identification method and system based on twin network and supervision contrast learning

Abstract

The invention discloses a shaving board surface defect identification method and system based on a twin network and supervised contrastive learning. The method comprises: (1) collecting shaving board surface images, preprocessing them, constructing a small-sample dataset containing multiple types of defects, and dividing it into a training set and a test set; (2) constructing an improved twin supervised contrast network model comprising a feature extraction backbone network, an LDFPN and a classification and contrastive learning head; (3) training the model on the training set with joint optimization of a cross-entropy classification loss and a supervised contrastive loss; and (4) using the trained model to identify and classify defects in shaving board surface images. The method and system address the low accuracy of shaving board surface defect identification and the weak generalization of models under small-sample conditions, achieve efficient and accurate automatic quality inspection, and have the potential for real-time deployment on a production line.
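Step (1) can be illustrated with a minimal sketch that uses the 512 × 512 block size and the 8:2 random split stated in claim 2. The function names `tile_boxes` and `split_dataset`, the choice to drop partial tiles at the image border, the absence of overlap between tiles, and the fixed random seed are all assumptions made for illustration; the source does not specify them.

```python
import random


def tile_boxes(width, height, tile=512):
    """Return (left, top, right, bottom) boxes of non-overlapping tiles.

    Assumption: partial tiles at the right and bottom borders are dropped.
    """
    return [
        (x, y, x + tile, y + tile)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into (train, test) lists.

    Assumption: a plain random split; the source does not say whether
    the split is stratified by defect class.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    boxes = tile_boxes(2048, 1024)
    train, test = split_dataset(range(100))
    print(len(boxes), len(train), len(test))
```

The boxes can be passed to any image library's crop routine. With very few samples per defect class, a stratified split may be preferable to the plain shuffle shown here.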

Inventors

  • LIU YING
  • CHEN JIANSHENG
  • ZHOU HAIYAN
  • XIA HAIFEI
  • HUANG ZHIXIN
  • YANG YUTU
  • FAN CHENLONG
  • XI SHUANG

Assignees

  • Nanjing Forestry University (南京林业大学)

Dates

Publication Date
20260505
Application Date
20260210

Claims (10)

  1. A shaving board surface defect identification method based on a twin network and supervised contrastive learning, characterized by comprising the following steps: (1) collecting shaving board surface images, preprocessing them, constructing a small-sample dataset containing multiple types of defects, and dividing the dataset into a training set and a test set; (2) constructing an improved twin supervised contrast network model, the model comprising: a feature extraction backbone network, which adopts the pre-trained Vision Transformer model ViT-S/14 reg4 DINOv2; a lightweight multi-scale feature fusion pyramid structure LDFPN, which performs multi-scale fusion and enhancement on the output features of the feature extraction backbone network; and a classification and contrastive learning head comprising a classification branch and a projection branch in parallel; (3) training the twin supervised contrast network model with the training set, jointly optimizing a cross-entropy classification loss and a supervised contrastive loss during training; and (4) performing defect identification and classification on shaving board surface images using the trained twin supervised contrast network model.
  2. The method according to claim 1, wherein step (1) specifically comprises: collecting shaving board surface images; removing the background from the collected images, cropping them, and dividing them into 512 × 512 image blocks; constructing a small-sample dataset containing five defect types: glue spots, scratches, sand leakage, pollution and breakage; and randomly dividing the dataset into a training set and a test set at a ratio of 8:2.
  3. The method of claim 1, wherein the Vision Transformer model ViT-S/14 reg4 DINOv2 of step (2) takes an input token sequence comprising a CLS token, patch tokens and 4 register tokens, performs global context modeling through a self-attention mechanism, and has its backbone weights frozen during training.
  4. The method of claim 1, wherein the lightweight multi-scale feature fusion pyramid structure LDFPN of step (2) comprises: downsampling the input feature map step by step with a lightweight downsampling module based on depthwise separable convolution to obtain multi-scale feature maps; upsampling the feature maps of different scales to the same spatial resolution, concatenating them along the channel dimension, and fusing them through a 1 × 1 convolution; enhancing the fused features with an improved coordinate attention mechanism; and converting the enhanced feature map into a feature vector of uniform dimension through global average pooling.
  5. The method of claim 4, wherein the improved coordinate attention mechanism is a residual adaptive coordinate attention module RACA, comprising: applying global average pooling to the input feature map along the height and width directions respectively to obtain a height feature vector and a width feature vector; concatenating the feature vectors of the two directions along the spatial dimension and compressing the channels through a 1 × 1 convolution; splitting the compressed features into height and width components, and generating attention weights for the height and width directions through a convolution and a Sigmoid activation function respectively; and combining the original input features with the attention-weighted features through a residual addition governed by a learnable scaling factor to obtain the enhanced output feature map.
  6. The method of claim 1, wherein the joint loss function in step (3) is: $L = L_{CE} + \lambda L_{SCL}$; where $\lambda$ is the weight of the contrastive loss, $L_{CE}$ is the cross-entropy classification loss, and $L_{SCL}$ is the supervised contrastive loss.
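  7. The method of claim 6, wherein the supervised contrastive loss function is defined as: $L_{SCL} = \sum_{i=1}^{M} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a=1, a \neq i}^{M} \exp(z_i \cdot z_a / \tau)}$; where $M$ is the total number of sample views in the batch, $P(i)$ is the set of positive samples that share the label of sample $i$, $z$ is the L2-normalized projection feature, and $\tau$ is the temperature coefficient.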
  8. The method of claim 1, wherein the training configuration of step (3) comprises training for 150 epochs with the Adam optimizer, an initial learning rate of 0.0001 and a batch size of 16.
  9. The method of claim 1, wherein training and inference of the twin supervised contrast network model use different network architecture modes: (a) in the training phase, the model operates as a twin network: (a1) two randomly augmented views are generated for each defect image in the training set to form a positive sample pair; (a2) the two views are fed into the two parameter-sharing branches of the network and pass through the feature extraction backbone network, the lightweight multi-scale feature fusion pyramid structure LDFPN, and the classification and contrastive learning head; (a3) the model is optimized with the joint loss function, wherein the classification branch is supervised for multi-class defect identification with the cross-entropy loss, and the features output by the projection branch are used to compute the supervised contrastive loss, which pulls samples of the same class together and pushes samples of different classes apart; (b) in the inference phase, the model is converted into a single-branch multi-class classification network: (b1) forward propagation is performed on only a single image under inspection; (b2) after features are extracted by the feature extraction backbone network and the LDFPN, the classification branch of the classification and contrastive learning head directly outputs a score for each class; (b3) the class with the highest score is taken as the final defect identification result.
  10. A shaving board surface defect detection system, comprising, in order: an image acquisition module for acquiring an original image of the shaving board surface; a preprocessing module for removing the background from the original image and cutting it into image blocks of a set size; a defect recognition module for loading and running the twin supervised contrast network model trained by the method according to any one of claims 1-9, and performing feature extraction and classification on the input image blocks; and a result output module for outputting the defect type identification result and the confidence for each image block.
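The joint optimization described in claims 6, 7 and 9 can be sketched in plain Python. This is a minimal illustration, not the patented implementation: it assumes the commonly used supervised contrastive formulation over L2-normalized projection features, a joint loss of the form cross-entropy plus a weighted contrastive term, and averaging over anchors that have at least one positive. The function names and the default values of `tau` and `lam` are hypothetical; the source does not state them.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, tau=0.1):
    """Supervised contrastive loss over all views in a batch.

    Assumption: the loss is averaged over anchors with at least one positive.
    """
    z = [l2_normalize(v) for v in embeddings]
    m = len(z)
    total, anchors = 0.0, 0
    for i in range(m):
        positives = [p for p in range(m) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of anchor i to every other view.
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / tau
                for a in range(m) if a != i}
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy(logits, target):
    """Cross-entropy of one sample, computed from raw class scores."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(s - top) for s in logits))
    return log_sum - logits[target]


def joint_loss(class_logits, embeddings, labels, lam=0.5, tau=0.1):
    """Mean cross-entropy plus lam times the supervised contrastive loss."""
    ce = sum(cross_entropy(s, y) for s, y in zip(class_logits, labels)) / len(labels)
    return ce + lam * supcon_loss(embeddings, labels, tau)


if __name__ == "__main__":
    # Two views per class; embeddings of the same class point the same way.
    emb = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
    scores = [[2.0, 0.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
    print(joint_loss(scores, emb, [0, 0, 1, 1]))
```

In a twin-network training step, `embeddings` would hold the projection-branch outputs of both augmented views of every image, and `class_logits` the matching classification-branch scores. At inference only the classification scores are needed, so the contrastive term adds no deployment cost.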

Description

Shaving board surface defect identification method and system based on twin network and supervision contrast learning

Technical Field

The invention relates to the technical fields of intelligent manufacturing for wood processing and computer vision, and in particular to an intelligent shaving board surface defect identification method and system, based on a twin network and supervised contrastive learning, for small-sample conditions.

Background

Shaving board is a major wood-based panel product and is widely used in furniture manufacturing, interior decoration and building components. Its surface quality directly affects the appearance and service performance of the product. However, owing to factors such as the heterogeneity of wood raw materials, uneven adhesive distribution and fluctuations in the hot-pressing process, the surface of shaving board is prone to defects such as glue spots, scratches, sand leakage, pollution and breakage.

At present, most domestic manufacturers still rely on manual visual inspection for defect detection and classification. This approach is inefficient and subjective, is prone to errors caused by visual fatigue, and exposes personnel to safety hazards when they work beside equipment running at high speed.

In recent years, deep-learning-based visual detection methods (such as frameworks based on YOLO and Fast R-CNN) have been introduced into industrial quality inspection and have achieved good results in some scenarios. However, these methods typically require a large number of labeled samples to reach the desired performance. In actual shaving board production, as process technology improves, the defect rate keeps falling, so defect samples are scarce and costly to collect, which creates a typical small-sample learning scenario.
Under these conditions, traditional deep learning methods overfit very easily and generalize poorly.

To address the small-sample problem, twin networks (Siamese networks) and contrastive learning have attracted attention. A twin network classifies by comparing the similarity between sample pairs, which makes it naturally suited to scenarios with few samples. Supervised contrastive learning uses limited label information to explicitly pull samples of the same class together and push samples of different classes apart in the feature embedding space, thereby learning more discriminative feature representations. However, combining these learning paradigms with the particular visual characteristics of shaving board surface defects (such as a complex wood grain background, variable defect scale and strong texture interference) to build an efficient, robust and easily deployed identification method and system remains a major challenge for current technology.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art and provides a shaving board surface defect identification method and system based on a twin network and supervised contrastive learning. The method and system address the low accuracy of shaving board surface defect identification and the weak generalization of models under small-sample conditions, achieve efficient and accurate automatic quality inspection, and have the potential for real-time deployment on a production line.
To achieve this technical purpose, the invention adopts the following technical scheme: a shaving board surface defect identification method based on a twin network and supervised contrastive learning, comprising the following steps: (1) collecting shaving board surface images, preprocessing them, constructing a small-sample dataset containing multiple types of defects, and dividing the dataset into a training set and a test set; (2) constructing an improved twin supervised contrast network model, the model comprising: a feature extraction backbone network, which adopts the pre-trained Vision Transformer model ViT-S/14 reg4 DINOv2; a lightweight multi-scale feature fusion pyramid structure LDFPN, which performs multi-scale fusion and enhancement on the output features of the backbone network; and a classification and contrastive learning head comprising a classification branch and a projection branch in parallel; (3) training the twin supervised contrast network model with the training set, jointly optimizing a cross-entropy classification loss and a supervised contrastive loss during training; and (4) performing defect identification and classification on shaving board surface images using the trained twin supervised contrast network model.

As a further improved technical schem