
CN-121997109-A - Noise label robust learning method and system based on DI and SNOP


Abstract

The invention discloses a noise label robust learning method and system based on DI (Dissimilarity Invariance) and SNOP (Structured Negative Orthogonality Penalty), belonging to the technical field of deep learning model training and optimization. The method comprises the following steps: 1, constructing a ResNet-18 neural network model; 2, extracting features from a training data set containing noise labels using a feature extractor to obtain feature vectors; 3, dividing sample pairs into a semantically related pair set and a semantically unrelated pair set according to the semantic distance of their label pairs and a set threshold; 4, obtaining the total loss of the ResNet-18 neural network model; 5, updating the model parameters θ through back-propagation using the total loss to obtain a trained ResNet-18 neural network model; 6, inputting test data into the trained model to obtain the multi-modal scene classification result. Compared with the prior art, the method improves the robustness of the deep learning model when noise labels are present in the classification task.
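The weighted loss combination described in step 4 can be sketched minimally in Python. The combining form follows equation (9) of the claims; the default values of the balance coefficients are illustrative, since the patent does not specify them:

```python
# Minimal sketch of the total loss of step 4 / equation (9):
#   L_total = L_org + lambda * L_SNOP + mu * L_DCSA
# The defaults for the balance coefficients lam and mu are illustrative;
# the patent does not give concrete values.
def total_loss(l_ce, l_snop, l_dcsa, lam=0.1, mu=0.1):
    return l_ce + lam * l_snop + mu * l_dcsa
```

With the illustrative defaults lam = mu = 0.1, `total_loss(1.0, 2.0, 3.0)` yields 1.5.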

Inventors

  • LI KAN
  • FAN WENXIAO

Assignees

  • Beijing Institute of Technology (北京理工大学)

Dates

Publication Date
2026-05-08
Application Date
2025-12-01

Claims (6)

  1. A noise label robust learning method based on DI and SNOP, characterized by comprising the following steps: Step 1, constructing a ResNet-18 neural network model and randomly initializing the model parameters θ; Step 2, extracting features from the training data set containing noise labels using a feature extractor to obtain feature vectors; Step 3, dividing sample pairs into a semantically related pair set and a semantically unrelated pair set according to the semantic distance of their label pairs and a set threshold; Step 3.1, obtaining the semantic distance of a label pair composed of two training-set sample labels, as shown in formula (2), using the dissimilarity-invariant class hierarchy structure, where h(·) is the depth function of the class hierarchy tree and NL(·) denotes the lowest common ancestor node; Step 3.2, setting semantic dissimilarity thresholds η_min and η_max, and dividing the sample pairs into the semantically related pair set and the semantically unrelated pair set shown in formula (3) according to the set thresholds; Step 4, combining the cross-entropy loss, the structured negative orthogonal penalty loss, and the dissimilarity-calibrated similarity adjustment loss to obtain the total loss of the ResNet-18 neural network model; Step 5, updating the model parameters θ through back-propagation using the total loss of the ResNet-18 neural network model to obtain a trained ResNet-18 neural network model; Step 6, inputting the test data into the trained ResNet-18 neural network model to obtain the multi-modal scene classification result.
  2. The noise label robust learning method based on DI and SNOP as set forth in claim 1, wherein step 2 is implemented by: Step 2.1, acquiring the training data set containing noise labels as shown in formula (1), where x_i denotes the input data, the corresponding label is possibly corrupted, and N is the total number of samples; Step 2.2, extracting features from the training data set using a feature extractor f(·; θ) to obtain the feature vectors.
  3. The noise label robust learning method based on DI and SNOP as set forth in claim 1, wherein step 4 is implemented by: Step 4.1, obtaining the structured negative orthogonal penalty loss for the semantically unrelated samples; Step 4.2, obtaining the dissimilarity-calibrated similarity adjustment loss for the semantically related samples; Step 4.3, combining the cross-entropy loss, the structured negative orthogonal penalty loss, and the dissimilarity-calibrated similarity adjustment loss to obtain the total loss of the ResNet-18 neural network model shown in formula (9): L_total = L_org + λ·L_SNOP + μ·L_DCSA (9), where L_org is the cross-entropy loss and λ, μ are balance coefficients.
  4. The noise label robust learning method based on DI and SNOP as set forth in claim 3, wherein step 4.1 is implemented by: Step 4.1.1, using the structured negative orthogonality penalty (SNOP, Structured Negative Orthogonality Penalty) method to construct a set of difference vectors from the semantically unrelated pair set; Step 4.1.2, obtaining the global orthogonality loss and the local confidence-weighted orthogonality loss respectively using formula (5), where w_ij = (1 - c_i)·(1 - c_j) and c_i is the prediction confidence of sample i, i.e., the softmax probability of its predicted class; Step 4.1.3, obtaining the SNOP loss of the difference vector set shown in formula (6) from the global orthogonality loss and the local confidence-weighted orthogonality loss: L_SNOP = L_global + L_local (6).
  5. The noise label robust learning method based on DI and SNOP as set forth in claim 3, wherein step 4.2 is implemented by: Step 4.2.1, performing dissimilarity-calibrated similarity adjustment (DCSA, Dissimilarity-Calibrated Similarity Adjustment) on the semantically related pair set to obtain the calibration factor shown in formula (7), where p and q denote the dissimilarity anchor points, i.e., the corresponding semantically unrelated samples, i and j are the corresponding semantically related samples, and T denotes vector transposition; Step 4.2.2, obtaining the DCSA loss shown in formula (8) using the cosine similarity.
  6. A noise label robust learning system based on DI and SNOP for implementing the method as claimed in claim 1, comprising a data processing module, a feature extraction module, a dissimilarity analysis module, an SNOP module, a DCSA module, and a loss calculation and model training module; the data processing module is used for reading the training data set containing noise labels and performing data cleaning and preprocessing; the feature extraction module is used for mapping input samples to a feature representation space and outputting the corresponding feature vectors; the dissimilarity analysis module is used for dividing sample pairs into a semantically related pair set and a semantically unrelated pair set according to a set threshold, based on the semantic distance of the sample pairs obtained from the class hierarchy structure, the two sets serving as inputs to the SNOP module and the DCSA module; the SNOP module is used for obtaining the set of feature difference vectors and applying global and local orthogonality constraints to the semantically unrelated pairs to obtain the structured negative orthogonal penalty loss, which serves as an input to the loss calculation and model training module; the DCSA module is used for acquiring a calibration factor through dissimilarity anchor points using the semantically related pairs, setting an adaptive soft upper limit on similarity, and penalizing positive pairs that exceed the limit to obtain the dissimilarity-calibrated similarity adjustment loss, which serves as an input to the loss calculation and model training module; the loss calculation and model training module is used for calculating the cross-entropy loss of the model on the training data set, combining the cross-entropy loss, the SNOP loss, and the DCSA loss into a total loss function by weighting, and updating the model parameters iteratively with an optimization algorithm to realize end-to-end robust training.
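The SNOP loss of claim 4 and the DCSA loss of claim 5 can be sketched together in a toy example. The source reproduces only the combining equation L_SNOP = L_global + L_local (6) and the weight w_ij = (1 - c_i)·(1 - c_j); the per-pair orthogonality term (squared cosine of normalized difference vectors) and the DCSA calibration cap (1 - |cos(p, q)|) below are assumed readings, not the patented formulas (5), (7), and (8), and all function names are illustrative:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v)) or 1.0
    return [a / n for a in v]

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u) * dot(v, v)) or 1.0)

def snop_loss(diff_vecs, conf):
    """Structured negative orthogonal penalty, L_SNOP = L_global + L_local (6).
    diff_vecs: feature-difference vectors of semantically unrelated pairs;
    conf[i]: softmax confidence c_i of the prediction for vector i.
    Assumed per-pair term: squared cosine of the normalized vectors, with
    the local term weighted by w_ij = (1 - c_i) * (1 - c_j)."""
    d = [normalize(v) for v in diff_vecs]
    l_global = l_local = 0.0
    for i in range(len(d)):
        for j in range(i + 1, len(d)):
            cos2 = dot(d[i], d[j]) ** 2          # deviation from orthogonality
            l_global += cos2
            l_local += (1 - conf[i]) * (1 - conf[j]) * cos2
    return l_global + l_local

def dcsa_loss(related_pairs, anchors):
    """Dissimilarity-calibrated similarity adjustment (claim 5, hedged).
    For each related feature pair (zi, zj), dissimilarity anchors (p, q)
    yield a calibration factor used as an adaptive soft upper limit on the
    cosine similarity; only the excess above the cap is penalized.
    The cap 1 - |cos(p, q)| is an assumed form of formula (7)."""
    loss = 0.0
    for (zi, zj), (p, q) in zip(related_pairs, anchors):
        cap = 1.0 - abs(cosine(p, q))            # assumed calibration factor
        excess = cosine(zi, zj) - cap
        loss += max(0.0, excess) ** 2            # penalize similarity over cap
    return loss / max(len(related_pairs), 1)
```

As a sanity check, mutually orthogonal difference vectors give an SNOP loss of zero, and a related pair whose anchors are already orthogonal incurs no DCSA penalty under this assumed cap.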

Description

Noise label robust learning method and system based on DI and SNOP

Technical Field

The invention relates to a noise label robust learning method and system based on DI and SNOP, belongs to the technical field of deep learning model training and optimization, and is applicable to multi-modal scene classification tasks such as image classification, speech recognition, and natural language processing in which the label data contains noisy or erroneous labels.

Background

Deep learning has recently made remarkable progress in computer vision, speech recognition, natural language processing, and other fields. However, in practical application scenarios it is often difficult to fully guarantee the accuracy of training-data annotations, so the noisy label (Noisy Labels) problem easily occurs. Existing noise label learning methods mainly fall into three categories. 1. Loss correction methods weaken the influence of noisy labels by estimating a noise transition matrix or correcting the loss function; however, the noise transition matrix is difficult to estimate accurately, and uniformly correcting all samples can accumulate errors and degrade model performance. 2. Sample selection methods select clean samples for training based on loss values or feature similarity and discard suspected noisy samples; however, they usually discard a large amount of data, wasting knowledge. 3. Semi-supervised learning methods treat suspected noisy samples as unlabeled data and improve model robustness through semi-supervised strategies; however, they are sensitive to hyperparameter settings and increase training complexity.
These methods mostly rely on sample similarity as the learning signal, but when noisy labels are present, semantic similarity is easily broken, causing the model to learn wrong associations and reducing its generalization performance. In contrast, semantic dissimilarity remains stable in a noisy label environment; this phenomenon is referred to as dissimilarity invariance (DI, Dissimilarity Invariance). The dissimilarity signal is less sensitive to noise than the fragile similarity signal and can serve as a more reliable training anchor. However, this phenomenon has not been systematically applied in prior noise label learning frameworks, and systematic methods combining it with feature orthogonality constraints and similarity adaptation are lacking. Therefore, improving the robustness of deep learning models when classification tasks contain noisy labels has become a problem to be solved.

Disclosure of the Invention

The invention aims to solve the technical problem of improving the robustness of a deep learning model in a noisy label classification scenario, and provides a noise label robust learning method and system based on DI and SNOP. By using the dissimilarity signal, which remains stable under label noise, as a robust anchor point, and combining it with structured orthogonality constraints in the feature space and adaptive similarity adjustment, the false similarity caused by noisy labels is effectively suppressed, and the robustness and generalization ability of the model in a high-noise environment are significantly improved.
The aim of the invention is achieved by the following technical scheme. The invention discloses a noise label robust learning method based on DI and SNOP, comprising the following steps: Step 1, constructing a ResNet-18 neural network model and randomly initializing the model parameters θ; Step 2, extracting features from the training data set containing noise labels using a feature extractor to obtain feature vectors; Step 2.1, acquiring the training data set containing noise labels as shown in formula (1), where x_i denotes the input data, the corresponding label is possibly corrupted, and N is the total number of samples; Step 2.2, extracting features from the training data set using the feature extractor f(·; θ) to obtain the feature vectors; Step 3, dividing sample pairs into a semantically related pair set and a semantically unrelated pair set according to the semantic distance of their label pairs and a set threshold; Step 3.1, obtaining the semantic distance of a label pair composed of two training-set sample labels, as shown in formula (2), using the class hierarchy structure of dissimilarity invariance (DI, Dissimilarity Invariance), where h(·) is the depth function of the class hierarchy tree and NL(·) denotes the lowest common ancestor node; Step 3.2, setting semantic dissimilarity thresholds η_min and η_max, and dividing the sample pairs into the semantically related pair set and the semantically unrelated pair set shown in formula (3) according to the set thresholds.
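The hierarchy-based semantic distance of Step 3.1 and the threshold split of Step 3.2 can be sketched as follows. Formula (2) is not reproduced in the source text; this sketch assumes the common depth-based tree distance d(y_i, y_j) = h(y_i) + h(y_j) - 2·h(NL(y_i, y_j)), and the toy hierarchy and all names are illustrative:

```python
# Sketch of Step 3 (hedged): hierarchy-based semantic distance of label
# pairs and the threshold split into semantically related / unrelated
# pair sets. The distance form below is an assumption, not formula (2):
#   d(yi, yj) = h(yi) + h(yj) - 2 * h(NL(yi, yj)),
# with h(.) the node depth and NL(.) the lowest common ancestor.

# Toy class hierarchy, child -> parent (the root has no entry).
PARENT = {
    "dog": "mammal", "cat": "mammal", "bird": "animal",
    "mammal": "animal", "animal": "root",
    "car": "vehicle", "truck": "vehicle", "vehicle": "root",
}

def ancestors(label):
    """Path from a label up to the root, inclusive."""
    path = [label]
    while label in PARENT:
        label = PARENT[label]
        path.append(label)
    return path

def depth(label):
    return len(ancestors(label)) - 1   # the root has depth 0

def lowest_common_ancestor(a, b):
    """NL(a, b): first ancestor of b that is also an ancestor of a."""
    anc_a = set(ancestors(a))
    for node in ancestors(b):
        if node in anc_a:
            return node
    return "root"

def semantic_distance(a, b):
    return depth(a) + depth(b) - 2 * depth(lowest_common_ancestor(a, b))

def split_pairs(pairs, eta_min, eta_max):
    """Pairs with distance <= eta_min are semantically related, pairs
    with distance >= eta_max are semantically unrelated; pairs between
    the two thresholds are left out of both sets."""
    related, unrelated = [], []
    for a, b in pairs:
        d = semantic_distance(a, b)
        if d <= eta_min:
            related.append((a, b))
        elif d >= eta_max:
            unrelated.append((a, b))
    return related, unrelated
```

For the toy tree above, `semantic_distance("dog", "cat")` is 2 (siblings under "mammal"), while `semantic_distance("dog", "car")` is 5 (the nearest common ancestor is the root), so with η_min = 2 and η_max = 4 the first pair is classed as related and the second as unrelated.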