CN-116797904-B - Image identification uncertainty knowledge distillation method and system
Abstract
The invention discloses an image recognition uncertainty knowledge distillation method and system. Labeled training image samples are collected. A first neural network model is selected and trained on these samples to obtain a trained first neural network model, into which the training sample images are fed to obtain its intermediate-layer sample feature representations and the soft-label information output by its image processing. A second neural network model is selected and processes the training image samples to obtain its own intermediate-layer sample feature representations; uncertainty modeling is performed on the intermediate-layer sample feature representations of the first neural network model to obtain a first loss function. Finally, the parameters of the second neural network model are updated using the soft-label information output by the first neural network model and the training image samples, combined with the first loss function, to obtain a trained second neural network model.
Inventors
- TANG QIANKUN
- LI XIAOYUAN
- WANG JUN
- XU XIAOGANG
- FENG XIANZHONG
- YU HUI
- HE PENGFEI
- LI YUE
- HAN QIANG
- CAO WEIQIANG
Assignees
- Zhejiang Lab (之江实验室)
- Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences (中国科学院东北地理与农业生态研究所)
Dates
- Publication Date
- 2026-05-05
- Application Date
- 2023-04-24
Claims (10)
- 1. An image recognition uncertainty knowledge distillation method, comprising the following steps:
S100, collecting sample images, processing and labeling them to obtain a first number of labeled training image samples;
S200, selecting a first neural network model, training it with the first number of training image samples and updating its parameter values to obtain a trained first neural network model, then inputting the training sample images into the first neural network model to obtain its intermediate-layer sample feature representations and the soft-label information output by its image processing;
S300, selecting a second neural network model, processing the first number of training image samples to obtain intermediate-layer sample feature representations, and performing uncertainty modeling with the intermediate-layer sample feature representations obtained by the first neural network model to obtain a first loss function, specifically comprising:
S301, selecting an intermediate convolution layer of the second neural network model and inputting the training image samples to obtain the intermediate-layer sample feature representations of the second neural network model;
S302, calculating the channel-wise semantic similarity between the intermediate-layer sample feature representations of the second neural network model and those of the first neural network model;
S303, calculating the spatial semantic similarity between the intermediate-layer sample feature representations of the second neural network model and those of the first neural network model;
S304, obtaining the mean of the feature-representation uncertainty of each sample of the first neural network model from the calculated channel and spatial semantic similarities;
S305, further processing the calculated channel and spatial semantic similarities with a fully connected layer to obtain the variance of the feature-representation uncertainty of each sample of the first neural network model;
S306, obtaining an uncertainty value from the calculated mean and variance of the sample feature-representation uncertainty using the reparameterization trick, thereby obtaining the first knowledge distillation loss function;
S400, processing the soft-label information output by the first neural network model and the training image samples, combining them with the first loss function to obtain an overall loss function, and updating the parameters of the second neural network model to obtain a trained second neural network model.
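Steps S301–S306 can be sketched as follows. This is a minimal NumPy illustration under assumptions the claim does not fix: cosine similarity is used for both the channel-wise (S302) and spatial (S303) semantic similarities, a single linear map with weights `w_var` stands in for the fully connected variance head (S305) and outputs a log-variance, and a sigmoid maps the sampled uncertainty value onto a weight for a mean-squared feature-matching error. All function and parameter names are hypothetical.

```python
import numpy as np

def channel_similarity(fs, ft):
    """S302: cosine similarity per channel between student (fs) and teacher (ft)
    features of shape (B, C, H, W); returns (B, C)."""
    a = fs.reshape(fs.shape[0], fs.shape[1], -1)
    b = ft.reshape(ft.shape[0], ft.shape[1], -1)
    num = (a * b).sum(-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-8
    return num / den

def spatial_similarity(fs, ft):
    """S303: cosine similarity per spatial position over channels; returns (B, H*W)."""
    a = fs.reshape(fs.shape[0], fs.shape[1], -1)
    b = ft.reshape(ft.shape[0], ft.shape[1], -1)
    num = (a * b).sum(1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + 1e-8
    return num / den

def uncertainty_distill_loss(fs, ft, w_var, rng):
    """S304-S306: first loss, a feature-matching error weighted by a sampled
    per-sample uncertainty value."""
    sims = np.concatenate([channel_similarity(fs, ft),
                           spatial_similarity(fs, ft)], axis=1)   # (B, C + H*W)
    mu = sims.mean(axis=1, keepdims=True)        # S304: mean of the uncertainty
    log_var = sims @ w_var                       # S305: FC layer -> (B, 1) log-variance
    eps = rng.standard_normal(mu.shape)
    u = mu + np.exp(0.5 * log_var) * eps         # S306: reparameterization trick
    feat_err = ((fs - ft) ** 2).reshape(fs.shape[0], -1).mean(axis=1, keepdims=True)
    weight = 1.0 / (1.0 + np.exp(-u))            # sigmoid: map u into (0, 1)
    return float((weight * feat_err).mean())
```

In this sketch the reparameterization trick keeps the sampled uncertainty differentiable with respect to `mu` and `log_var`, which is the usual reason for drawing `u` as `mu + sigma * eps` rather than sampling directly.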
- 2. The image recognition uncertainty knowledge distillation method according to claim 1, wherein processing the soft-label information output by the first neural network model and the training image samples, combining them with the first loss function to obtain an overall loss function, and updating the parameters of the second neural network model comprises:
S401, processing the training sample images with the second neural network model and outputting the resulting predicted label information;
S402, calculating a second loss function from the soft-label information output by the first neural network model and the predicted label information output by the second neural network model;
S403, calculating a third loss function from the predicted label information output by the second neural network model and the label information of the training image samples;
S404, adding the first, second and third loss functions to obtain the overall loss function.
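The claim names the second and third losses without fixing their form. A common reading, sketched below in NumPy, is a temperature-softened KL divergence between teacher soft labels and student predictions for S402 and a cross-entropy against the ground-truth labels for S403; the temperature value `t=4.0` and these exact loss choices are assumptions, not stated in the claim.

```python
import numpy as np

def softmax(z, t=1.0):
    """Numerically stable softmax with temperature t over the class axis."""
    z = z / t
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_label_loss(teacher_logits, student_logits, t=4.0):
    """S402 (assumed form): KL divergence from teacher soft labels to student
    predictions, scaled by t^2 as is conventional in distillation."""
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=1).mean()
    return float(kl * t * t)

def hard_label_loss(student_logits, labels):
    """S403 (assumed form): cross-entropy between student predictions and
    the ground-truth labels of the training image samples."""
    q = softmax(student_logits)
    return float(-np.log(q[np.arange(len(labels)), labels] + 1e-12).mean())

def overall_loss(first, second, third):
    """S404: the three losses are simply summed (no weighting is stated)."""
    return first + second + third
```

Training then minimizes `overall_loss` with respect to the second (student) network's parameters only; the first (teacher) network is frozen.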
- 3. An image recognition uncertainty knowledge distillation system for implementing the image recognition uncertainty knowledge distillation method of claim 1, comprising:
an acquisition module for collecting sample images and processing and labeling them to obtain a first number of labeled training image samples;
a first updating module, connected to the acquisition module, for selecting a first neural network model, training it with the first number of training image samples and updating its parameter values to obtain a trained first neural network model, then inputting the training sample images into the first neural network model to obtain its intermediate-layer sample feature representations and the soft-label information output by its image processing;
an uncertainty modeling module, connected to the acquisition module and the first updating module, for selecting a second neural network model, processing the first number of training image samples to obtain intermediate-layer sample feature representations, and performing uncertainty modeling with the intermediate-layer sample feature representations obtained by the first neural network model to obtain a first loss function;
and a second updating module, connected to the acquisition module, the first updating module and the uncertainty modeling module, for processing the soft-label information output by the first neural network model and the training image samples, combining them with the first loss function to obtain an overall loss function, and updating the parameters of the second neural network model to obtain a trained second neural network model.
- 4. The image recognition uncertainty knowledge distillation system according to claim 3, wherein the uncertainty modeling module is specifically configured to:
select an intermediate convolution layer of the second neural network model and input the training image samples to obtain the intermediate-layer sample feature representations of the second neural network model;
calculate the channel-wise semantic similarity between the intermediate-layer sample feature representations of the second neural network model and those of the first neural network model;
calculate the spatial semantic similarity between the intermediate-layer sample feature representations of the second neural network model and those of the first neural network model;
obtain the mean of the feature-representation uncertainty of each sample of the first neural network model from the calculated channel and spatial semantic similarities;
further process the calculated channel and spatial semantic similarities with a fully connected layer to obtain the variance of the feature-representation uncertainty of each sample of the first neural network model;
and obtain an uncertainty value from the calculated mean and variance of the sample feature-representation uncertainty using the reparameterization trick, thereby obtaining the first knowledge distillation loss function.
- 5. The image recognition uncertainty knowledge distillation system according to claim 3, wherein the second updating module is configured to:
process the training sample images with the second neural network model and output the resulting predicted label information;
calculate a second loss function from the soft-label information output by the first neural network model and the predicted label information output by the second neural network model;
calculate a third loss function from the predicted label information output by the second neural network model and the label information of the training image samples;
and add the first, second and third loss functions to obtain the overall loss function.
- 6. An image processing method employing the image recognition uncertainty knowledge distillation method of claim 1, comprising: acquiring a second number of test image samples to be processed; performing image recognition on the second number of test image samples using the second neural network model trained by the image recognition uncertainty knowledge distillation method; and obtaining and outputting the recognition result.
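At test time only the trained second (student) network is used; the teacher and the distillation losses play no role. A minimal sketch, in which the hypothetical `toy_student` linear head merely stands in for a real trained second neural network:

```python
import numpy as np

def recognize(student_forward, test_images):
    """Apply the trained second (student) network to a batch of test images
    and return the recognition result as predicted class indices."""
    logits = student_forward(test_images)    # (N, num_classes)
    return logits.argmax(axis=1)

def toy_student(images, w):
    """Hypothetical stand-in for a trained student: flatten + fixed linear head."""
    return images.reshape(images.shape[0], -1) @ w
```

Because distillation only changes how the student was trained, this inference path is identical to that of any ordinary image classifier.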
- 7. An image processing system for implementing the method of claim 6, comprising:
an acquisition module for acquiring a second number of test image samples to be processed;
a recognition processing module, connected to the acquisition module, for performing image recognition on the second number of test image samples using the second neural network model trained by the image recognition uncertainty knowledge distillation system;
and a recognition output module, connected to the recognition processing module, for obtaining and outputting the recognition result.
- 8. An image recognition processing device, comprising an image collector, a memory, one or more processors and an external output device, wherein the image collector is used to collect a first number of image samples and a second number of image samples; the memory stores executable code; the one or more processors, when executing the executable code, implement the image recognition uncertainty knowledge distillation method of claim 1 and the image processing method of claim 6; and the external output device is used to output and display the image recognition result obtained by the image processing method of claim 6.
- 9. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the image recognition uncertainty knowledge distillation method of claim 1 and the image processing method of claim 6.
- 10. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the image recognition uncertainty knowledge distillation method of claim 1 and the image processing method of claim 6.
Description
Image recognition uncertainty knowledge distillation method and system

Technical Field
The invention relates to the field of computer vision, and in particular to an image recognition uncertainty knowledge distillation method and system.

Background
Knowledge distillation is a neural network model compression and acceleration technique. It can substantially reduce the resource requirements of neural-network-based models, such as image classification or object detection models, on resource-constrained devices, while maintaining high recognition accuracy. Its basic principle is to use the feature representations and predicted image-label information of a trained neural network model with high image recognition accuracy (the first neural network model) to guide the training of another neural network model (the second neural network model) with fewer parameters and less computation. This can markedly improve the image recognition accuracy of the resource-efficient second neural network model. However, current knowledge distillation techniques assume that the knowledge representations extracted and distilled from the first neural network model are accurate and highly discriminative, and ignore the noise and other misleading information those representations contain. As a result, the representation capability acquired by the second neural network model is weakened, which prevents further improvement of image recognition accuracy.
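The "soft label information" referred to throughout is conventionally the teacher's temperature-softened output distribution: raising the temperature exposes the teacher's relative confidence in the wrong classes, which is the extra signal the student learns from. A minimal sketch (the temperature value is an assumption; the patent does not specify one):

```python
import numpy as np

def soft_labels(teacher_logits, t=4.0):
    """Temperature-softened teacher outputs ('soft labels').
    Higher t flattens the distribution, revealing relative confidences
    across non-target classes; t=4 is a common choice, not fixed here."""
    z = teacher_logits / t
    z = z - z.max(axis=-1, keepdims=True)   # stability shift
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

At `t = 1` this reduces to the ordinary softmax; as `t` grows, the soft labels approach a uniform distribution while keeping the same ranking of classes.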
Disclosure of Invention
To overcome the defects of the prior art, so that more deterministic and highly discriminative knowledge representations can be extracted during knowledge distillation and the image recognition accuracy of the second neural network model can be improved, the invention adopts the following technical scheme.

In one aspect, the invention provides an image recognition uncertainty knowledge distillation method, comprising:
S100, collecting sample images, processing and labeling them to obtain a first number of labeled training image samples;
S200, selecting a first neural network model, training it with the first number of training image samples and updating its parameter values to obtain a trained first neural network model, then inputting the training sample images into the first neural network model to obtain its intermediate-layer sample feature representations and the soft-label information output by its image processing;
S300, selecting a second neural network model, processing the first number of training image samples to obtain intermediate-layer sample feature representations, and performing uncertainty modeling with the intermediate-layer sample feature representations obtained by the first neural network model to obtain a first loss function;
S400, processing the soft-label information output by the first neural network model and the training image samples, combining them with the first loss function to obtain an overall loss function, and updating the parameters of the second neural network model to obtain a trained second neural network model.

Optionally, selecting a second neural network model, processing the first number of training image samples to obtain intermediate-layer sample feature representations, and performing uncertainty modeling with the intermediate-layer sample feature representations obtained by the first neural network model to obtain a first loss function includes: selecting an intermediate convolution layer of the second neural network model and inputting the training image samples to obtain the intermediate-layer sample feature representations of the second neural network model; calculating the channel-wise semantic similarity between the intermediate-layer sample feature representations of the second neural network model and those of the first neural network model; calculating the spatial semantic similarity between the intermediate-layer sample feature representations of the second neural network model and those of the first neural network model; obtaining the mean of the feature-representation uncertainty of each sample of the first neural network model from the calculated channel and spatial semantic similarities; further processing the calculated channel and spatial semantic similarities with a fully connected layer to obtain the variance of the feature-representation uncertainty of each sample of the first neural network model; and obtaining an uncertainty value from the calculated mean and variance of the sample feature-representation uncertainty using the reparameterization trick, thereby obtaining the first knowledge distillation loss function.

Optionally, using the image processing soft