CN-122023816-A - Noise label robust training method and device for image recognition model

CN122023816ACN 122023816 ACN122023816 ACN 122023816ACN-122023816-A

Abstract

The invention discloses a noise label robust training method and device for an image recognition model, which belong to the technical field of image recognition model training, and comprise the steps of inputting an image training sample into a deep neural network for feature extraction; the method comprises the steps of constructing a multi-granularity pellet structure in a feature space based on feature vectors, performing layering correction on labels of image training samples, performing intra-pellet label propagation, calculating propagation confidence distribution and consistency scores of each image sample, screening clean sample subsets according to the consistency scores, and performing iterative training on a deep neural network to update network parameters. According to the invention, through a self-adaptive multi-granularity sphere dividing mechanism, the local structural characteristics of data can be fully described in a characteristic space, the problems of excessive fragmentation or insufficient purity generated by the traditional fixed granularity clustering under a high noise environment are avoided, and a stable and reliable structure priori is provided for subsequent label correction and information transmission.

Inventors

LIU QIANG
YANG JUNXIAO
ZHU GUANGQIAN
LIU LUPING

Assignees

成都信息工程大学

Dates

Publication Date: 20260512
Application Date: 20260408

Claims (10)

1. A noise label robust training method for an image recognition model, comprising the steps of: s1, acquiring an image training sample containing a noise label; s2, inputting the image training sample into a deep neural network for feature extraction to obtain feature vectors and corresponding category prediction probabilities of the image in a feature space; S3, constructing a multi-granularity pellet structure in a feature space based on the feature vector, and dividing pellets by adopting a self-adaptive multi-granularity pellet dividing strategy based on pellet purity; s4, performing layering correction on the labels of the image training samples by combining the model prediction confidence, the pellet purity and the pellet structure information; S5, carrying out intra-pellet label propagation by utilizing structural characteristics of intra-pellet image training samples and correction labels, and calculating propagation confidence distribution and consistency scores of each image sample; and S6, screening out a clean sample subset according to the consistency score, and performing iterative training on the deep neural network by using the clean sample subset and the corrected label so as to update network parameters.
2. The noise label robust training method for an image recognition model according to claim 1, wherein the S2 specifically comprises: Inputting the image training samples into a deep neural network, mapping the input image training samples into a feature space through a feature extraction network to obtain feature vectors of each image training sample, carrying out normalization processing on the feature vectors through norms to form a feature library, and obtaining class prediction probability of the corresponding feature vectors through a classification head and a softmax function.
3. The noise label robust training method for an image recognition model according to claim 2, wherein the S3 specifically comprises: obtaining a plurality of pellets on the feature library by using a K-means method, and calculating the maximum recursion division times of each pellet or the pellets generated by segmentation; if the current dividing times do not exceed the maximum recursion dividing times and the pellet purity is lower than the preset purity threshold, the internal characteristics of the pellets are divided into two pellets by using K-means division, otherwise, the division of the pellets is stopped.
4. The noise label robust training method for image recognition model according to claim 3, wherein in S3, a maximum number of recursive divisions is calculated, expressed as: In the formula, For the maximum number of recursive divisions, For a globally set maximum upper iteration limit, Representing a downward rounding; As pellet purity, expressed as: In the formula, The category is indicated as such, Represent the first The index of the sample in the individual pellet, Is the first A sample class label of a number, As an indication function.
5. The noise label robust training method for image recognition models of claim 4, wherein in S4, the hierarchical modification includes fine granularity modification and coarse granularity modification; Wherein, fine granularity correction is executed firstly, which is specifically as follows: In the formula, In order to refine the label in terms of the fine, For the sample Is used to determine the maximum prediction confidence of the model, Correcting a threshold value for fine granularity; for category Is used to determine the prediction probability of (1), Is the first Labels of the individual samples; if the maximum class prediction probability of the image training sample exceeds a preset fine granularity correction threshold value Updating the label of the image training sample to the category with the maximum prediction probability, and further obtaining a fine-granularity correction label; And then performing coarse granularity correction, namely, for an image training sample which does not meet the fine granularity correction condition, if the purity of the pellet to which the image training sample belongs is greater than a preset purity threshold value, updating the label of the image training sample into a weighted majority label in the pellet, otherwise, keeping the label unchanged, wherein the method is specifically expressed as: In the formula, In order to modify the label, To weight a majority label; As a threshold value for the purity level, Average confidence for all samples.
6. The noise label robust training method for image recognition models of claim 5, wherein in said S4, a majority label is weighted It is expressed as: In the formula, Is a pellet ball Index in middle is Is the sample pair of (2) The original prediction scores of the individual categories are, Is the total number of samples.
7. The noise label robust training method for image recognition models according to claim 1, wherein said S5 specifically comprises the sub-steps of: S51, for each category For having correction label The similarity matrix of the shot spheres of the image training samples of (1) is aggregated to obtain each image training sample pair of each category Propagation contribution of score: In the formula, Indicating pellet Middle (f) Individual element pair category The propagation contribution of the score; Is a pellet ball The middle label is of the category Is used for the sample index of (a), Representation for each category Pellet ball With correction labels therein Is a subset of the sample index of (a); Is a pellet ball Inner first Sample and the first The degree of similarity between the individual samples, Is a pellet ball Inner first Propagation weights for the individual samples; s52, carrying out normalization processing on the propagation contribution of each sample on all categories to obtain a propagation confidence distribution vector of the sample: In the formula, Representing a sample Propagation probability of (a); In order to categorize the total number of categories of tasks, Represent the first The original score vector of the individual samples, Is a constant; and S53, calculating a consistency score according to the transmission confidence distribution vector.
8. The noise label robust training method for image recognition models of claim 7, wherein in S53, a consistency score is calculated from the propagation confidence distribution vector, expressed as: Wherein: In the formula, In order to score correctly the score of the score, To be corrected label Is used to determine the voting value of (1), For the maximum propagation score in category C, For the sample Is a correction label of (a) The probability values in the propagation probability distribution, For the sample Belonging to the first Post-propagation probability values for classes.
9. The method for robust training of noise labels for image recognition models of claim 1, wherein in S6, the filtering out a subset of clean samples based on the consistency score comprises if a correct score is If the score is larger than or equal to the scoring threshold, the propagation result is consistent with the correction label, the sample is a clean sample, and if the score is correct If the value is smaller than the scoring threshold, the propagation result is deviated from the correction label, and the sample is a label containing noise, namely: In the formula, To include an index that is clean in samples and suitable for training, To include an index of samples marked as noise, Is a scoring threshold.
10. A noise tag robust training apparatus for an image recognition model, comprising: the image acquisition module is used for acquiring an image training sample containing a noise label; The feature extraction module is used for inputting the image training sample into the deep neural network to perform feature extraction to obtain feature vectors and corresponding category prediction probabilities of the image in a feature space; the pellet construction module is used for constructing a multi-granularity pellet structure in a feature space based on the feature vector and dividing pellets by adopting a self-adaptive multi-granularity pellet dividing strategy based on pellet purity; the layering correction module is used for performing layering correction on the labels of the image training samples by combining the model prediction confidence, the pellet purity and the pellet structure information; The pellet label propagation module is used for performing pellet label propagation by utilizing the structural characteristics of the pellet internal image training samples and the correction labels, and calculating propagation confidence distribution and consistency scores of each image sample; And the model updating module is used for screening a clean sample subset according to the consistency score, and carrying out iterative training on the deep neural network by utilizing the clean sample subset and the corrected label so as to update network parameters.

Description

Noise label robust training method and device for image recognition model Technical Field The invention belongs to the technical field of image recognition model training, and particularly relates to a noise label robust training method and device for an image recognition model. Background In terms of image recognition, with the development of deep learning technology, a deep neural network model (Deep neural networks, DNN) based on supervised learning has made remarkable progress in tasks such as image classification and target recognition. However, supervised learning approaches are highly dependent on large-scale, high quality, manually labeled data. In practical application, because of high manual labeling cost, subjective difference of labeling personnel, errors in an automatic acquisition process and ambiguity of a sample in a complex scene, the training data often inevitably contains noise labels, namely observation labels of the sample are inconsistent with real labels of the sample. The DNN has stronger fitting capability, and along with the advancement of the training process, the model gradually memorizes noise labels, so that the model is subjected to fitting error supervision signals, and generalization performance and robustness are reduced. Therefore, learning under noise tags (LEARNING WITH noise labers, LNL) has become a critical area of research, critical to the exploitation of large-scale, imperfect data in real-world applications. A mainstream LNL approach is sample selection and re-labeling (Sample Selection and Relabeling, SSR) with the objective of identifying potentially clean samples using criteria such as small loss or predictive consistency, and then correcting false labels by model driven re-labeling. However, conventional small-loss-based selection methods tend to discard boundary samples and noise samples, which can lead to significant information loss and degradation of generalization performance. The correction of labels mainly depends on the high confidence prediction of the model itself to strengthen false labels of errors, if the model is difficult to recognize noise labels with complex image environments in early stages, the confidence of the model on the false predictions is increased, so that the model is over fitted with the errors. In order to solve the limitations, the invention carries out secondary correction on complex labels which are difficult to identify by some models from a characteristic layer through a multi-granularity view on the basis of model confidence correction. The main contributions are 1) an adaptive multi-granularity clustering mechanism that analyzes features across multiple granularity levels while capturing coarse-granularity and fine-granularity data details. 2) A hierarchical label correction strategy first corrects easily resolved noise labels by model prediction, then performs majority voting within a sphere (GB) to perform coarse-grained re-labeling to preserve intra-sphere consistency and suppress ubiquitous errors. 3) A pellet tag propagation component uses similarity weighted intra-sphere voting to enable probability tag diffusion to dilute isolated noise while amplifying reliable signals. Disclosure of Invention The invention aims to overcome the defects in the prior art, and provides a noise label robust training method and device for an image recognition model, so as to solve the problems of high manual labeling cost, high data acquisition error and label noise caused by sample ambiguity. In order to achieve the above purpose, the invention adopts the following technical scheme: in a first aspect, a noise label robust training method for an image recognition model, comprising the steps of: s1, acquiring an image training sample containing a noise label; s2, inputting the image training sample into a deep neural network for feature extraction to obtain feature vectors and corresponding category prediction probabilities of the image in a feature space; S3, constructing a multi-granularity pellet structure in a feature space based on the feature vector, and dividing pellets by adopting a self-adaptive multi-granularity pellet dividing strategy based on pellet purity; s4, performing layering correction on the labels of the image training samples by combining the model prediction confidence, the pellet purity and the pellet structure information; S5, carrying out intra-pellet label propagation by utilizing structural characteristics of intra-pellet image training samples and correction labels, and calculating propagation confidence distribution and consistency scores of each image sample; and S6, screening out a clean sample subset according to the consistency score, and performing iterative training on the deep neural network by using the clean sample subset and the corrected label so as to update network parameters. The method comprises the steps of firstly carrying out forward propagation on an image training sample containing noise l