CN-117557846-B - Uncertainty image recognition method and device for low-quality image data
Abstract
The invention discloses an uncertainty image recognition method and device for low-quality image data. A first neural network model is trained on image samples, and the trained model is used to obtain the mean and variance of a first multidimensional Gaussian space for each image sample, which are stored in a memory. A second neural network model maps the image samples to a second multidimensional Gaussian space and predicts its mean and variance; samples are drawn from the second multidimensional Gaussian space and input into a second classifier to calculate a second loss. The first-space mean and variance are retrieved from the memory to calculate a contrastive Gaussian distillation loss, and the parameters of the second neural network model and the second classifier are optimized based on the second loss and the Gaussian distillation loss. Image recognition is then performed with the parameter-optimized second neural network model and second classifier. By changing the feature expression capability of a small-scale network, the method improves recognition robustness on low-quality image data.
Inventors
- TANG QIANKUN
Assignees
- 之江实验室 (Zhejiang Lab)
Dates
- Publication Date
- 2026-05-08
- Application Date
- 2023-11-15
Claims (9)
- 1. An uncertainty knowledge distillation training method for low-quality image data, comprising the steps of: acquiring labeled image samples; mapping an input image sample to a first multidimensional Gaussian space using a selected first neural network model and predicting its mean and variance, sampling from the first multidimensional Gaussian space, inputting the samples into a first classifier to calculate a first loss, and optimizing the parameters of the first neural network model and the first classifier based on the first loss; obtaining the mean and variance of the first multidimensional Gaussian space for each image sample using the parameter-optimized first neural network model, and storing them in a memory; mapping the input image sample to a second multidimensional Gaussian space using a selected second neural network model and predicting its mean and variance, sampling from the second multidimensional Gaussian space, inputting the samples into a second classifier to calculate a second loss, simultaneously retrieving the first multidimensional Gaussian space mean and variance from the memory to calculate a contrastive Gaussian distillation loss, and optimizing the parameters of the second neural network model and the second classifier based on the second loss and the Gaussian distillation loss; wherein retrieving the first multidimensional Gaussian space mean and variance from the memory to calculate the contrastive Gaussian distillation loss comprises: retrieving from the memory, by sample number, the first multidimensional Gaussian space mean and variance of a positive image sample in the first neural network model as a first positive example, and simultaneously taking the second multidimensional Gaussian space mean and variance of the positive image sample with the same sample number in the second neural network model as a second positive example; retrieving from the memory, by sample number, the first multidimensional Gaussian space mean and variance of negative image samples belonging to categories different from the positive example in the first neural network model as first negative examples, and taking the second multidimensional Gaussian space mean and variance of the negative image samples with the same sample numbers in the second neural network model as second negative examples; calculating the positive-example similarity between the positive examples in Gaussian space using the mutual likelihood score, calculating the negative-example similarity between the positive and negative examples in Gaussian space using the mutual likelihood score, and calculating the contrastive Gaussian distillation loss based on the positive-example similarity and the negative-example similarity.
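The two training stages of claim 1 can be sketched as follows. This is a minimal numpy toy, not the patented implementation: the linear `gaussian_head`, the tiny three-sample "dataset", and all dimensions are hypothetical stand-ins for the real networks and data.

```python
import numpy as np

rng = np.random.default_rng(0)
D_in, D_emb, K = 6, 4, 8          # toy input/embedding sizes, K MC samples

def gaussian_head(x, W_mu, W_var):
    # Map an input to the mean and log-variance of a diagonal Gaussian.
    return x @ W_mu, x @ W_var

def draw_samples(mu, log_var, K, rng):
    # Reparameterized Monte Carlo sampling: z_k = mu + eps_k * sigma.
    eps = rng.standard_normal((K, mu.shape[0]))
    return mu + eps * np.exp(0.5 * log_var)

# A tiny dataset of 3 "images", indexed by sample number.
X = rng.standard_normal((3, D_in))

# Stage 1 (first model): after optimizing the first network and classifier,
# cache every sample's first-space mean/variance in a memory keyed by its
# sample number.
W_mu_1 = rng.standard_normal((D_in, D_emb))
W_var_1 = 0.1 * rng.standard_normal((D_in, D_emb))
memory = {i: gaussian_head(X[i], W_mu_1, W_var_1) for i in range(len(X))}

# Stage 2 (second model): map the same sample into the second Gaussian
# space, draw K embeddings for the second classifier's loss, and retrieve
# the cached first-space statistics for the contrastive distillation term.
W_mu_2 = rng.standard_normal((D_in, D_emb))
W_var_2 = 0.1 * rng.standard_normal((D_in, D_emb))
mu_2, log_var_2 = gaussian_head(X[0], W_mu_2, W_var_2)
z = draw_samples(mu_2, log_var_2, K, rng)   # input to the second classifier
mu_1, log_var_1 = memory[0]                 # first-space stats for sample 0
```

In a real implementation both stages would run gradient descent on the respective losses; here only the data flow between the two models and the memory is shown.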
- 2. The uncertainty knowledge distillation training method for low-quality image data according to claim 1, wherein mapping the input image samples to a first multidimensional Gaussian space and predicting the mean and variance using the selected first neural network model is formalized as: $(\mu_1, \sigma_1^2) = f_{\theta_1}(x)$, where $x$ represents the input image sample, $f_{\theta_1}$ represents the feature embedding function of the first neural network model, and $\mu_1$, $\sigma_1^2$ represent the mean and variance of the first multidimensional Gaussian space to which the first neural network model maps; when sampling from the first multidimensional Gaussian space, K samples are drawn by Monte Carlo sampling with reparameterization and input into the first classifier to calculate the first loss, formalized as: $z_k = \mu_1 + \epsilon_k \odot \sigma_1$, $\epsilon_k \sim \mathcal{N}(0, I)$, $\mathcal{L}_1 = -\frac{1}{K}\sum_{k=1}^{K} \log \frac{\exp(w_c^{\top} z_k + b_c)}{\sum_{c' \in C} \exp(w_{c'}^{\top} z_k + b_{c'})}$, where $\mu_1$ and $\sigma_1^2$ represent the mean and variance of the first multidimensional Gaussian space, $\epsilon_k$ represents the reparameterization noise of the k-th sampling, drawn from a standard normal distribution, $z_k$ represents the k-th sampled sample, $w$ and $b$ represent the parameters of the first classifier, $c$ is the index of a category, $C$ represents the set of categories, and $\mathcal{L}_1$ represents the first loss.
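The sampling and loss computation of claim 2 can be illustrated with numpy. This is a sketch under toy assumptions: the Gaussian parameters and classifier weights are random placeholders, and only the forward loss (not its optimization) is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_reparameterized(mu, log_var, K, rng):
    # Draw K Monte Carlo samples z_k = mu + eps_k * sigma, eps_k ~ N(0, I).
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal((K,) + mu.shape)
    return mu + eps * sigma

def first_loss(z, W, b, label):
    # Softmax cross-entropy averaged over the K sampled embeddings.
    logits = z @ W + b                                  # (K, |C|)
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[:, label].mean()

# Toy dimensions: D-dim Gaussian embedding, |C| classes, K samples.
D, C, K = 8, 5, 10
mu, log_var = rng.standard_normal(D), 0.1 * rng.standard_normal(D)
W, b = rng.standard_normal((D, C)), np.zeros(C)

z = sample_reparameterized(mu, log_var, K, rng)
loss1 = first_loss(z, W, b, label=2)
```

Because the noise enters through the deterministic transform `mu + eps * sigma`, gradients of the loss flow back to the mean and variance, which is the point of the reparameterization trick.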
- 3. The uncertainty knowledge distillation training method for low-quality image data according to claim 1, wherein the mean and variance of the first multidimensional Gaussian space are stored in the memory using the sample numbers of the image samples as index values.
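A minimal sketch of the memory in claim 3, assuming a plain in-process dictionary keyed by sample number (the patent does not specify a storage backend, so `store`/`retrieve` are hypothetical helper names):

```python
import numpy as np

# First-space mean/variance of each image sample, stored under its sample
# number so the second training stage can look statistics up by index.
memory = {}

def store(memory, sample_id, mu, var):
    memory[sample_id] = (np.asarray(mu, dtype=float),
                         np.asarray(var, dtype=float))

def retrieve(memory, sample_id):
    return memory[sample_id]

store(memory, 42, mu=[0.1, -0.3], var=[0.5, 0.2])
mu_t, var_t = retrieve(memory, 42)
```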
- 4. The uncertainty knowledge distillation training method for low-quality image data according to claim 1, wherein mapping the input image samples to a second multidimensional Gaussian space and predicting the mean and variance using the selected second neural network model is formalized as: $(\mu_2, \sigma_2^2) = f_{\theta_2}(x)$, where $x$ represents the input image sample, $f_{\theta_2}$ represents the feature embedding function of the second neural network model, and $\mu_2$, $\sigma_2^2$ represent the mean and variance of the second multidimensional Gaussian space to which the second neural network model maps; when sampling from the second multidimensional Gaussian space, K samples are drawn by Monte Carlo sampling with reparameterization and input into the second classifier to calculate the second loss, formalized as: $z_k = \mu_2 + \epsilon_k \odot \sigma_2$, $\epsilon_k \sim \mathcal{N}(0, I)$, $\mathcal{L}_2 = -\frac{1}{K}\sum_{k=1}^{K} \log \frac{\exp(w_c^{\top} z_k + b_c)}{\sum_{c' \in C} \exp(w_{c'}^{\top} z_k + b_{c'})}$, where $\mu_2$ and $\sigma_2^2$ represent the mean and variance of the second multidimensional Gaussian space, $\epsilon_k$ represents the reparameterization noise of the k-th sampling, drawn from a standard normal distribution, $z_k$ represents the k-th sampled sample, $w$ and $b$ represent the parameters of the second classifier, $c$ is the index of a category, $C$ represents the set of categories, and $\mathcal{L}_2$ represents the second loss.
- 5. The uncertainty knowledge distillation training method for low-quality image data according to claim 1, wherein the positive-example similarity T between the positive examples in Gaussian space is calculated using the mutual likelihood score, formalized as: $T = s(p_1^{i}, p_2^{i})$, where $p_1^{i}$ and $p_2^{i}$ respectively denote the first positive example and second positive example of the positive image sample with sample number $i$, corresponding to the first and second neural network models, and $s(\cdot,\cdot)$ denotes the mutual likelihood score between two diagonal Gaussians, $s\big((\mu_a,\sigma_a^2),(\mu_b,\sigma_b^2)\big) = -\frac{1}{2}\sum_{d}\Big(\frac{(\mu_a^{(d)}-\mu_b^{(d)})^2}{\sigma_a^{2(d)}+\sigma_b^{2(d)}} + \log\big(\sigma_a^{2(d)}+\sigma_b^{2(d)}\big)\Big)$; the negative-example similarity N between the positive and negative examples in Gaussian space is calculated using the mutual likelihood score, formalized as: $N_g = s(p_2^{i}, n_1^{g}) + s(p_1^{i}, n_2^{g})$, where $n_1^{g}$ and $n_2^{g}$ respectively denote the first negative example and second negative example of the g-th negative image sample, corresponding to the first and second neural network models, $s(p_2^{i}, n_1^{g})$ denotes the mutual likelihood score between $p_2^{i}$ and $n_1^{g}$, and $s(p_1^{i}, n_2^{g})$ denotes the mutual likelihood score between $p_1^{i}$ and $n_2^{g}$; the contrastive Gaussian distillation loss $\mathcal{L}_{GD}$ is calculated based on the positive-example similarity T and the negative-example similarity N, formalized as: $\mathcal{L}_{GD} = -\log\frac{\exp(T)}{\exp(T) + \sum_{g=1}^{G}\exp(N_g)}$, where G represents the number of negative examples.
- 6. An uncertainty knowledge distillation training device for low-quality image data, characterized by comprising an acquisition module, a first updating module, a storage module, and a second updating module; the acquisition module is used for acquiring labeled image samples; the first updating module is used for mapping an input image sample to a first multidimensional Gaussian space using a selected first neural network model and predicting its mean and variance, sampling from the first multidimensional Gaussian space, inputting the samples into a first classifier to calculate a first loss, and optimizing the parameters of the first neural network model and the first classifier based on the first loss; the storage module is used for obtaining the mean and variance of the first multidimensional Gaussian space for each image sample using the parameter-optimized first neural network model and storing them in the memory; the second updating module is used for mapping the input image sample to a second multidimensional Gaussian space using a selected second neural network model and predicting its mean and variance, sampling from the second multidimensional Gaussian space, inputting the samples into a second classifier to calculate a second loss, simultaneously retrieving the first multidimensional Gaussian space mean and variance from the memory to calculate a contrastive Gaussian distillation loss, and optimizing the parameters of the second neural network model and the second classifier based on the second loss and the Gaussian distillation loss; wherein retrieving the first multidimensional Gaussian space mean and variance from the memory to calculate the contrastive Gaussian distillation loss comprises: retrieving from the memory, by sample number, the first multidimensional Gaussian space mean and variance of a positive image sample in the first neural network model as a first positive example, and simultaneously taking the second multidimensional Gaussian space mean and variance of the positive image sample with the same sample number in the second neural network model as a second positive example; retrieving from the memory, by sample number, the first multidimensional Gaussian space mean and variance of negative image samples belonging to categories different from the positive example in the first neural network model as first negative examples, and taking the second multidimensional Gaussian space mean and variance of the negative image samples with the same sample numbers in the second neural network model as second negative examples; calculating the positive-example similarity between the positive examples in Gaussian space using the mutual likelihood score, calculating the negative-example similarity between the positive and negative examples in Gaussian space using the mutual likelihood score, and calculating the contrastive Gaussian distillation loss based on the positive-example similarity and the negative-example similarity.
- 7. An uncertainty image recognition method for low-quality image data, characterized in that the method adopts the parameter-optimized second neural network model and second classifier obtained by the uncertainty knowledge distillation training method according to any one of claims 1-5, the recognition method comprising the following steps: obtaining a test image sample; inputting the test image sample into the parameter-optimized second neural network model to obtain the second multidimensional Gaussian space corresponding to the test image sample, sampling K times from the second multidimensional Gaussian space to obtain K sampled samples, and inputting the K sampled samples into the second classifier to obtain K classification results; and averaging the K classification results to obtain the final classification result corresponding to the test image sample.
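The inference procedure of claim 7 can be sketched as follows, assuming a linear toy classifier and averaging the K softmax outputs (the claim does not specify whether logits or probabilities are averaged; probabilities are assumed here):

```python
import numpy as np

rng = np.random.default_rng(0)
D, C, K = 4, 3, 16    # toy embedding size, class count, number of MC samples

def predict(mu, log_var, W, b, K, rng):
    # Sample K embeddings from the second Gaussian space, classify each,
    # and average the K softmax outputs into the final prediction.
    eps = rng.standard_normal((K, mu.shape[0]))
    z = mu + eps * np.exp(0.5 * log_var)
    logits = z @ W + b
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return probs.mean(axis=0)                 # final classification result

# Placeholder Gaussian statistics for one test image and classifier weights.
mu, log_var = rng.standard_normal(D), np.full(D, -1.0)
W, b = rng.standard_normal((D, C)), np.zeros(C)
final = predict(mu, log_var, W, b, K, rng)
predicted_class = int(final.argmax())
```

Averaging over K samples marginalizes over the predicted uncertainty, which is what makes the prediction robust to low-quality inputs with large variance.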
- 8. An uncertainty image recognition device for low-quality image data, characterized by comprising an input module, a recognition module and an output module; the input module is used for acquiring a test image sample; the recognition module is used for inputting the test image sample into the parameter-optimized second neural network model of the method of any one of claims 1-5 to obtain the second multidimensional Gaussian space corresponding to the test image sample, sampling K times from the second multidimensional Gaussian space to obtain K sampled samples, and inputting the K sampled samples into the second classifier to obtain K classification results; and the output module is used for averaging the K classification results to obtain and output the final classification result corresponding to the test image sample.
- 9. A computing device comprising a memory and one or more processors, the memory having executable code stored therein, wherein the one or more processors, when executing the executable code, implement the uncertainty knowledge distillation training method of any one of claims 1-5 and the uncertainty image recognition method of claim 7.
Description
Uncertainty image recognition method and device for low-quality image data Technical Field The invention belongs to the field of image recognition, and in particular relates to an uncertainty image recognition method and device for low-quality image data. Background Low-quality image data refers to images whose content is difficult to recognize, or is recognized erroneously, due to illumination, occlusion, blurring, and the like. Small-scale networks in particular have weak modeling capability and cannot efficiently recognize and process such image data. Various technical solutions exist for processing such data. For example, publication CN115984949A describes a low-quality face image recognition method and equipment with an attention mechanism, comprising the steps of obtaining an unknown face image and a trained face image recognition network model, inputting the unknown face image into the face image recognition network model for feature extraction, inputting the generated abstract feature map into a global pooling layer to output face feature vectors, and calculating the distances between the face feature vectors and all target feature vectors in a search library to realize face recognition. Its feature extraction layer adopts convolutions with different kernel sizes and multiple feature fusion modes, combined with several strided convolution and pooling layers, to enhance image feature extraction.
For example, CN113962862A, a super-resolution-based low-quality image recognition method, device, equipment and medium, automatically generates normal-resolution and low-resolution images matched one to one, and constructs an initial network and a loss function, where the initial network comprises a super-resolution network and a classification network connected in sequence and the loss function comprises a super-resolution loss and a classification loss. Training samples are used to train the initial network until the loss function converges, yielding a low-quality image recognition model; a target image is input into this model and the recognition result is determined from its output, realizing fine-grained classification of low-resolution images by combining the super-resolution network and the classification network. Knowledge distillation is a method for transferring the knowledge representation of a high-precision but complex neural network model to a smaller-scale neural network model, which can significantly improve the recognition accuracy of a small-scale network. However, existing distillation techniques do not consider improving the robustness of small-scale networks (such as MobileNetV2, ShuffleNetV2, and VGG8) when recognizing low-quality image data, which limits the practical application of small-scale networks. Disclosure of Invention In view of the foregoing, it is an object of the present invention to provide an uncertainty image recognition method and apparatus for low-quality image data that improves recognition robustness on low-quality image data by changing the feature expression capability of a small-scale network.
To achieve the above object, an embodiment provides an uncertainty image recognition method for low-quality image data, including the steps of: acquiring labeled image samples; mapping an input image sample to a first multidimensional Gaussian space using a selected first neural network model and predicting its mean and variance, sampling from the first multidimensional Gaussian space, inputting the samples into a first classifier to calculate a first loss, and optimizing the parameters of the first neural network model and the first classifier based on the first loss; obtaining the mean and variance of the first multidimensional Gaussian space for each image sample using the parameter-optimized first neural network model, and storing them in a memory; mapping the input image samples to a second multidimensional Gaussian space using the selected second neural network model and predicting its mean and variance, sampling from the second multidimensional Gaussian space and inputting the samples into a second classifier to calculate a second loss, simultaneously retrieving the first multidimensional Gaussian space mean and variance from the memory to calculate a contrastive Gaussian distillation loss, and optimizing the parameters of the second neural network model and the second classifier based on the second loss and the Gaussian distillation loss. Preferably, the mapping the input image sample to the fi