CN-122023414-A - Trademark defect detection method based on an improved YOLOv11 and a generative network
Abstract
The invention belongs to the technical field of computer vision, and particularly discloses a trademark defect detection method based on an improved YOLOv11 and a generative network. The method first constructs an MGS-YOLOv11n model: in the backbone network, the C3k2 modules in the levels that extract features P3 and P4 are replaced with MG_C3k2 modules, as are the C3k2 modules of the neck in the P4 and P5 feature-fusion branches. In the downsampling links from feature P3 to P4 and from P4 to P5 during neck feature fusion, CIM_SPDConv modules replace ordinary convolutions. An improved pix2pix generative network is also constructed, in which synthetic defect images are generated by a residual generator together with an improved loss function built on an adaptive mask, so as to expand the training set. The method effectively addresses the scarcity of trademark defect samples and the tendency of small-target features to be lost, while markedly improving detection accuracy and robustness.
Inventors
- Wan Qimeng
- Zhu Enhao
- Ren Jia
Assignees
- Zhejiang Sci-Tech University (浙江理工大学)
Dates
- Publication Date: 2026-05-12
- Application Date: 2026-04-13
Claims (8)
- 1. A trademark defect detection method based on an improved YOLOv11 and a generative network, characterized by comprising the following steps: acquiring trademark images from fabrics, preprocessing them, uniformly scaling the image sizes, and inputting them into an offline-trained MGS-YOLOv11n model to obtain trademark defect types, candidate bounding boxes, and category probabilities. The MGS-YOLOv11n model is improved from the YOLOv11n network: in the backbone network, the C3k2 modules in the levels that extract features P3 and P4 are replaced with MG_C3k2 modules; in the neck, the C3k2 modules in the P4 and P5 feature-fusion branches are replaced with MG_C3k2 modules; and CIM_SPDConv modules replace ordinary convolutions in the downsampling links from feature P3 to P4 and from P4 to P5 during neck feature fusion. The MG_C3k2 module comprises three feature branches at different semantic levels (shallow, intermediate, and deep), and the CIM_SPDConv module comprises a serial SPD-Conv module, a channel interaction module with channel-by-channel multiplicative weighting, and a sequentially connected convolution layer, batch normalization, and activation function. The offline training of the MGS-YOLOv11n model uses a trademark defect dataset generated by an improved pix2pix generative network, which comprises a residual generator; the training process adopts a generator loss function improved with an adaptive mask.
- 2. The trademark defect detection method based on the improved YOLOv11 and the generative network according to claim 1, wherein the operation of the MG_C3k2 module is specifically: first, the number of channels of the input feature is compressed by a convolution, and the result is divided equally into two paths along the channel dimension; one path serves as the shallow feature, while the other path first passes through an MPBottleneck_GAM module and a convolution layer to generate the intermediate feature, and the intermediate feature then passes through another MPBottleneck_GAM module and convolution layer to generate the deep feature; finally, the shallow, intermediate, and deep features are concatenated along the channel dimension, and the feature map obtained after convolution and nonlinear activation serves as the output of the MG_C3k2 module.
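The split/branch/concatenate dataflow of claim 2 can be sketched as follows. This is a minimal numpy illustration, not the patented implementation: 1×1 convolutions are modeled as per-pixel channel mixing with random weights, the MPBottleneck_GAM module is stubbed as the identity, and all channel counts and the ReLU activation are illustrative assumptions.

```python
import numpy as np

def conv1x1(x, w):
    # 1x1 convolution == per-pixel channel mixing: x is (C_in, H, W), w is (C_out, C_in)
    return np.einsum('oc,chw->ohw', w, x)

def mp_bottleneck_gam(x):
    # Placeholder for the MPBottleneck_GAM module (claim 3); the real module
    # applies multi-scale convolutions and a GAM attention mechanism.
    return x

def mg_c3k2(x, rng):
    c = x.shape[0]
    # 1) compress channels by convolution, then split evenly into two paths
    w_in = rng.standard_normal((c // 2 * 2, c))
    y = conv1x1(x, w_in)
    shallow, mid_in = np.split(y, 2, axis=0)
    # 2) one path is kept as the shallow feature; the other produces the
    #    intermediate feature via MPBottleneck_GAM + conv ...
    w_mid = rng.standard_normal((mid_in.shape[0], mid_in.shape[0]))
    mid = conv1x1(mp_bottleneck_gam(mid_in), w_mid)
    # ... and the deep feature via a second MPBottleneck_GAM + conv
    w_deep = rng.standard_normal((mid.shape[0], mid.shape[0]))
    deep = conv1x1(mp_bottleneck_gam(mid), w_deep)
    # 3) concatenate shallow / intermediate / deep along channels, then
    #    convolution + nonlinear activation (ReLU here) forms the output
    cat = np.concatenate([shallow, mid, deep], axis=0)
    w_out = rng.standard_normal((c, cat.shape[0]))
    return np.maximum(conv1x1(cat, w_out), 0.0)

rng = np.random.default_rng(0)
out = mg_c3k2(rng.standard_normal((16, 8, 8)), rng)
```

The sketch only demonstrates the topology: the three branches carry progressively deeper semantics of the same spatial resolution before fusion.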
- 3. The trademark defect detection method based on the improved YOLOv11 and the generative network according to claim 2, wherein the MPBottleneck_GAM module operates as follows: the input feature passes through a multi-scale convolution unit, a convolution layer, and a GAM attention mechanism unit, and the resulting feature is residually connected with the input feature to form the output of the MPBottleneck_GAM module. The multi-scale convolution unit comprises three parallel convolution branches: the input feature passes through the three branches simultaneously, and their outputs are concatenated along the channel dimension. The residual connection includes a switchable residual-connection parameter, shortcut.
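Claim 3's MPBottleneck_GAM can be sketched in numpy as below. This is an illustrative approximation only: the three "convolution branches" are box filters at kernel sizes 1/3/5, their fusion is a plain average, and the GAM attention is reduced to a per-channel sigmoid gate; none of these choices are specified by the patent.

```python
import numpy as np

def box_filter(x, k):
    # stand-in for a convolution branch with kernel size k: a k x k box
    # average over each channel of an (C, H, W) map, 'same' padding
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)), mode='edge')
    h, w = x.shape[1:]
    out = np.zeros_like(x)
    for di in range(k):
        for dj in range(k):
            out += xp[:, di:di + h, dj:dj + w]
    return out / (k * k)

def mp_bottleneck_gam(x, shortcut=True):
    # three parallel branches at different scales, concatenated on channels
    y = np.concatenate([box_filter(x, k) for k in (1, 3, 5)], axis=0)
    # fuse back to the input channel count (simple average stands in for conv)
    y = y.reshape(3, x.shape[0], *x.shape[1:]).mean(axis=0)
    # GAM attention stand-in: channel-wise sigmoid gate from global averages
    gate = 1.0 / (1.0 + np.exp(-y.mean(axis=(1, 2))))
    y = y * gate[:, None, None]
    # switchable residual connection (the 'shortcut' parameter of claim 3)
    return x + y if shortcut else y
```

Setting `shortcut=False` disables the residual path, matching the switchable connection described in the claim.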
- 4. The trademark defect detection method based on the improved YOLOv11 and the generative network according to claim 3, wherein the operation of the channel interaction module of the CIM_SPDConv module is specifically: the input feature first has its channel count compressed by one convolution layer and then restored to the original count by another convolution layer. The channel-by-channel multiplicative weighting of the CIM_SPDConv module follows F̃_k = γ_k · F_k, where F_k is the feature of the k-th channel output by the channel interaction module and γ_k is the learnable scaling coefficient of the k-th channel.
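A minimal numpy sketch of claim 4's channel interaction plus per-channel weighting, under the assumption that both convolutions are 1×1 (so they reduce to channel mixing) and with arbitrary illustrative channel counts:

```python
import numpy as np

def cim_weighting(x, w_down, w_up, gamma):
    # channel interaction: compress the channel count with one 1x1 conv,
    # then restore it with another (modeled as channel-mixing matmuls)
    y = np.einsum('oc,chw->ohw', w_down, x)
    y = np.einsum('oc,chw->ohw', w_up, y)
    # channel-by-channel multiplicative weighting with learnable
    # per-channel scaling coefficients gamma_k
    return y * gamma[:, None, None]

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4, 4))
w_down = rng.standard_normal((2, 8))   # compress 8 -> 2 channels
w_up = rng.standard_normal((8, 2))     # restore 2 -> 8 channels
gamma = np.full(8, 0.5)
out = cim_weighting(x, w_down, w_up, gamma)
```

The bottleneck (8 → 2 → 8) forces the two convolutions to exchange information across channels before each channel is rescaled by its own γ_k.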
- 5. The method for detecting trademark defects based on the improved YOLOv11 and the generative network according to claim 4, wherein the improvement of the residual generator in the pix2pix generative network comprises: taking the non-defective artwork A as input, learning only the pixel variations from the non-defective artwork A to the defective image B so as to generate a residual image R, and deriving the composite defect image by superposition and truncation: B̂ = clip(A + R), where B̂ is the synthesized defect image and clip(·) is a truncation operation constraining values to the interval [−1, 1].
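The superposition-and-truncation step of claim 5 is a one-liner; the sketch below shows it on a toy 2×2 image (the numeric values are illustrative only):

```python
import numpy as np

def synthesize_defect(a, r):
    # composite defect image: superpose the residual on the clean artwork,
    # then truncate values to the [-1, 1] interval (the clip of claim 5)
    return np.clip(a + r, -1.0, 1.0)

a = np.array([[0.2, -0.9], [0.8, 0.0]])   # non-defective artwork A in [-1, 1]
r = np.array([[0.5, -0.5], [0.9, 0.0]])   # residual image R from the generator
b_hat = synthesize_defect(a, r)           # -> [[0.7, -1.0], [1.0, 0.0]]
```

Because the generator only has to model R rather than the whole image, unchanged background pixels correspond to residual values near zero.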
- 6. The method for detecting trademark defects based on the improved YOLOv11 and the generative network according to claim 5, wherein the generator loss function improved with the adaptive mask is a weighted sum L_G = λ_defect·L_defect + λ_bg·L_bg + λ_res·L_res + λ_tv·L_tv + L_cGAN, where λ_defect is the adjustment parameter of the defect region, λ_bg the adjustment parameter of the background region, λ_res the adjustment parameter of the residual activation, λ_tv the adjustment parameter of the total-variation smoothing, and L_cGAN the global adversarial loss function of the conditional generative adversarial network. The defect-region loss L_defect penalizes, at every coordinate (i, j) where the binary defect mask M(i, j) produced by the adaptive mask generation algorithm is active, the difference between the pixel value of the generated defect image and the pixel value B(i, j) of the true defect image. The background-region loss L_bg penalizes, outside the mask, the difference between the generated image and the pixel value A(i, j) of the defect-free image at (i, j). The residual-activation loss L_res penalizes the deviation of the average residual amplitude per pixel within the defect region, computed from the residual values R(i, j), from a target residual amplitude within the defect region. The total-variation smoothing loss L_tv is normalized by the number of valid pixels within the mask.
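The four mask-based terms of claim 6 can be sketched in numpy as below. The claim text does not fix the norm or normalizations, so the L1 distances and per-term divisors here are assumptions, as is the example target amplitude `r_target`.

```python
import numpy as np

def generator_losses(gen, real, clean, resid, mask, r_target=0.6):
    # gen:   generated defect image, real: true defect image B,
    # clean: defect-free image A,    resid: residual image R,
    # mask:  binary defect mask M (background mask is 1 - M)
    n_def = mask.sum() + 1e-8
    n_bg = (1 - mask).sum() + 1e-8
    # defect-area loss: generated vs. true defect image inside the mask
    l_defect = (mask * np.abs(gen - real)).sum() / n_def
    # background-area loss: generated vs. defect-free image outside the mask
    l_bg = ((1 - mask) * np.abs(gen - clean)).sum() / n_bg
    # residual-activation loss: average residual amplitude per pixel inside
    # the defect region, pulled toward a target amplitude
    r_bar = (mask * np.abs(resid)).sum() / n_def
    l_res = abs(r_target - r_bar)
    # total-variation smoothing loss, normalized by valid pixels in the mask
    dv = np.abs(np.diff(resid, axis=0)) * mask[1:, :]
    dh = np.abs(np.diff(resid, axis=1)) * mask[:, 1:]
    l_tv = (dv.sum() + dh.sum()) / n_def
    return l_defect, l_bg, l_res, l_tv
```

In a full training loop these would be combined with the λ weights and the conditional adversarial loss L_cGAN of the claim.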
- 7. The method for detecting trademark defects based on the improved YOLOv11 and the generative network according to claim 6, wherein the adaptive mask generation algorithm specifically comprises the following steps: (1) computing the mean absolute difference of the paired samples over the channel dimension to obtain a difference map D, D(i, j) = (1/C)·Σ_c |A_c(i, j) − B_c(i, j)|, where C is the number of channels of the image; (2) computing a dynamic threshold for the difference map D as the smaller of a quantile of D and a preset upper-limit threshold; (3) marking the pixels of D that exceed the dynamic threshold as the initial defect region M_0, dilating it with a convolution kernel to obtain the final defect mask M, and correspondingly inverting it to obtain the background mask M_bg = 1 − M.
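The three steps of claim 7 can be sketched directly in numpy. The quantile level `q`, the cap `t_max`, and the 3×3 dilation kernel below are illustrative choices; the patent only specifies a quantile capped by a preset upper limit and dilation by a convolution kernel.

```python
import numpy as np

def adaptive_mask(a, b, q=0.98, t_max=0.5):
    # (1) difference map D: mean absolute difference over the channel dim C
    d = np.abs(a - b).mean(axis=0)
    # (2) dynamic threshold: a quantile of D, capped by a preset upper limit
    t = min(np.quantile(d, q), t_max)
    # (3) threshold into the initial region M_0, then dilate with a 3x3
    #     kernel to obtain the final defect mask M
    m0 = (d > t).astype(float)
    mp = np.pad(m0, 1, mode='constant')
    h, w = m0.shape
    m = np.zeros_like(m0)
    for di in range(3):
        for dj in range(3):
            m = np.maximum(m, mp[di:di + h, dj:dj + w])
    return m, 1.0 - m   # defect mask M and background mask M_bg = 1 - M
```

The dilation is implemented as a max over shifted copies, which is equivalent to binary dilation with a full 3×3 structuring element.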
- 8. The method for detecting trademark defects based on the improved YOLOv11 and the generative network according to claim 7, wherein the construction process of the trademark defect dataset used for the offline training of the MGS-YOLOv11n model is as follows: (1) collecting trademark images on fabric, comprising both images with trademark defects and images of normal trademarks, as dataset I, and randomly dividing them in proportion into training set I, validation set I, and test set I; (2) performing offline training with training set I, validation set I, and test set I to obtain the trained improved pix2pix generative network, and then using the trained network to generate new pictures with trademark defects from all normal trademark images of dataset I; (3) randomly distributing the newly generated defect pictures in proportion into training set I, validation set I, and test set I, which together constitute the trademark defect dataset.
Description
Trademark defect detection method based on an improved YOLOv11 and a generative network

Technical Field

The invention belongs to the technical field of computer vision and deep learning, and particularly relates to a trademark defect detection method based on an improved YOLOv11 and a generative network.

Background

In the production and circulation of clothing, labels in the form of prints or heat-transfer marks are usually arranged on the inner side or outer surface of a garment to indicate the brand, size, place of production, and other information. Quality problems such as local absence, adhesion, and fading of label characters readily occur because of rough fabric texture, fabric undulation, uneven coating of ink or adhesive, or insufficient transfer-printing pressure. Traditional manual inspection is inefficient and easily influenced by subjective factors, so computer-based detection is gradually being adopted; YOLOv11, as a new-generation single-stage detection network, achieves a good balance of accuracy and speed on general object detection. However, when the original YOLOv11 is directly applied to detecting missing characters on clothing labels, the following prominent problems arise.

1. Insufficient capability to characterize small targets and fine-grained defects. The letters, numbers, and marks on a clothing label are small, occupying only tens of pixels in the whole image, and a defect is often just one stroke or a gap between strokes. Such defects are not only small in size but also very close in gray scale and shape to the surrounding normal strokes and fabric texture. The YOLOv11 backbone is dominated by the C3k2 block, whose convolutions are mostly single-scale 3×3 kernels stacked with simple residuals.
This design suits medium and large targets in general scenes, but in small-target detection the high-frequency, small-scale information is gradually smoothed away, so in trademark defect detection the YOLOv11 model frequently yields low confidence, inaccurate localization, or outright misses when a letter is absent or the printing is incomplete.

2. Feature downsampling insufficiently preserves character edges and notch structures. To obtain a larger receptive field and reduce computation, the YOLOv11 model downsamples between scales using convolutions with stride greater than 1. A medium or large target can be compressed well in the feature map, but small characters a dozen or so pixels high in a clothing label occupy limited area on the shallow feature maps and are often compressed to only a few sampling points after multiple strided convolutions. Ordinary strided convolution performs weighted summation within a local window while spatially sampling, without explicitly considering the geometric correspondence between adjacent sub-pixels. For diagonal or arc-shaped character edges, this sampling easily breaks edge continuity, which appears as jagged or cracked edges in the deep feature maps; small gaps inside strokes are then completely submerged after several rounds of sampling, and their existence is barely perceived by the network at high layers.

3. Defect samples are scarce, and synthetic data hardly reflects the true trademark defect distribution. In actual clothing production, trademark printing defects are low-probability events, and collecting and annotating, one by one, a large-scale set of real samples covering the various missing forms is very costly.
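The information loss described above is what the SPD-Conv idea inside the CIM_SPDConv module avoids: instead of a strided convolution that discards sub-pixel positions, a space-to-depth rearrangement moves every pixel into the channel dimension before a non-strided convolution. A minimal numpy sketch of the rearrangement (the stride-2 setting is illustrative):

```python
import numpy as np

def space_to_depth(x, s=2):
    # Rearrange a (C, H, W) feature map into (C*s*s, H/s, W/s) without
    # discarding any pixels -- unlike a stride-s convolution, every
    # sub-pixel of a character edge survives into the channel dimension.
    c, h, w = x.shape
    assert h % s == 0 and w % s == 0
    blocks = [x[:, i::s, j::s] for i in range(s) for j in range(s)]
    return np.concatenate(blocks, axis=0)

x = np.arange(16, dtype=float).reshape(1, 4, 4)
y = space_to_depth(x)   # (1, 4, 4) -> (4, 2, 2), all 16 values preserved
```

Halving the resolution this way is lossless; the following non-strided convolution can then learn which sub-pixel phases matter for thin character edges.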
Training a model offline directly on the limited defect samples easily overfits it to the few observed defect modes, leaving it insufficiently capable of recognizing unseen defect forms. To alleviate the sample shortage, a generative adversarial network is commonly used to synthesize defective images from non-defective trademark images. However, the defect regions generated directly by the pix2pix model are often not smoothly blended in color and texture with the surrounding background, their edges are hard and inconsistent with the gradual transitions of real printing defects, and the generator may produce pseudo-defects or alter the original textures in the background region, so the statistical distribution of the synthesized samples deviates markedly from the real trademark defect distribution and disturbs the learning of the detection model. In summary, the prior art falls short at both the feature-expression level and the data-generation level, and an improved scheme that enhances fine-grained defect perception and constructs high-quality synthetic defect data is needed.