
CN-122021778-A - Dual confidence calibration method, system, equipment and medium for neural network classifier

CN122021778A

Abstract

The invention provides a dual confidence calibration method, system, device, and medium for a neural network classifier, effectively addressing the difficulty that existing single-stage calibration methods have in simultaneously handling over-confidence, under-confidence, and sensitivity to class imbalance. In the training stage, a hybrid loss function integrating a dual focal loss with the difference between multi-class confidence and accuracy encourages the neural network classifier to produce well-calibrated prediction distributions. In the inference stage, a class-by-class multi-partition temperature scaling model, optimized by a coupled simulated annealing method, adapts to the calibration requirements of different classes and different confidence intervals more accurately than conventional single-coefficient temperature scaling. Classifiers from different rounds of the training process are saved, and in the testing stage the calibration results of the multiple models are averaged class by class, effectively improving calibration stability and reducing prediction confidence error. The method is particularly suited to safety-critical fields with extremely high requirements on prediction reliability, such as medical diagnosis and autonomous driving.

Inventors

  • QIAN XUSHENG
  • DAI YAKANG
  • ZHOU ZHIYONG
  • HU JISU
  • ZHENG YI

Assignees

  • Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences (中国科学院苏州生物医学工程技术研究所)
  • Suzhou Medical Device Industry Development Group Co., Ltd. (苏州医疗器械产业发展集团有限公司)

Dates

Publication Date
2026-05-12
Application Date
2025-12-10

Claims (10)

  1. A dual confidence calibration method for a neural network classifier, comprising the steps of: constructing a loss function for confidence calibration in the neural network training stage; designing a class-by-class multi-partition temperature scaling model for confidence calibration in the neural network inference stage; constructing a dual confidence calibration model based on the loss function and the class-by-class multi-partition temperature scaling model; and, in the test application stage, averaging the calibrated prediction probability vectors of a plurality of neural network classifier models saved at different training rounds, each with its corresponding class-by-class multi-partition temperature scaling model, class by class, where the maximum average probability is the prediction confidence and the corresponding class is the predicted class.
  2. The dual confidence calibration method of claim 1, wherein constructing the loss function for confidence calibration in the neural network training stage comprises: dividing the classification data set to be processed into a training set and a validation set in a preset proportion; and optimizing the neural network parameters on the training set using a hybrid loss function oriented toward confidence calibration.
  3. The dual confidence calibration method of claim 2, wherein the hybrid loss function comprises a dual focal loss together with a difference loss between multi-class confidence and correctness, the parameters of which include: a weight coefficient; a modulation factor; the number of samples in a batch; the number of classes; the probability with which the neural network model predicts the i-th sample as the m-th class; the highest predicted probability, among all classes other than the true class, assigned by the network model to the i-th sample; and an indicator that equals 1 when the true class of the i-th sample is the m-th class and 0 otherwise.
  4. The dual confidence calibration method of claim 3, wherein designing the class-by-class multi-partition temperature scaling model for inference-stage confidence calibration comprises: for an input sample, recording the logits vector output by the neural network classifier, the predicted class, and the corresponding prediction confidence; setting independent temperature coefficients for different confidence intervals of each class, taking the number of classes into account, by dividing the confidence range of each class both into R equal-width intervals and into V equal-frequency intervals (the latter ensuring that each interval contains the same number of samples); and establishing the class-by-class multi-partition temperature scaling model, whose quantities comprise the prediction confidence after calibration by temperature scaling, the temperature coefficient, the Softmax function, the model parameters to be optimized, a weight coefficient, and the temperature coefficient vectors corresponding to the respective intervals of the class-by-class equal-width and equal-frequency partitions.
  5. The dual confidence calibration method of claim 4, wherein constructing the dual confidence calibration model based on the loss function and the class-by-class multi-partition temperature scaling model comprises: setting the total number of training rounds of the neural network classifier to E, optimizing the classifier with the AdamW algorithm by minimizing the loss function, and saving one classifier after every 0.2E rounds of training; and, based on the over-confidence or under-confidence of the current classifier on the validation set data for each class and each confidence interval, optimizing the parameters of the class-by-class multi-partition temperature scaling model with a coupled simulated annealing algorithm, thereby constructing a confidence calibration model adapted to the current training state of the neural network.
  6. The dual confidence calibration method of claim 5, wherein the expected calibration error (ECE) of the neural network classifier on the validation set data is minimized with the coupled simulated annealing algorithm, the ECE comprising the prediction confidence errors caused by over-confidence and under-confidence of the classifier respectively, computed over a given number of confidence intervals from the prediction accuracy and the average prediction confidence that the classifier attains in each confidence interval of each class.
  7. The dual confidence calibration method of claim 6, wherein optimizing the parameters of the class-by-class multi-partition temperature scaling model with the coupled simulated annealing algorithm comprises: running several simulated annealing processes in parallel, and setting the initial set of model parameter groups, the maximum number of iterations K, the current iteration count, the initial generation temperature, and the acceptance temperature; generating new model parameters through a perturbation function, in which a number is randomly drawn each time new parameters are generated and a scaling factor controls the magnitude of the perturbation, and computing the ECE of the current model as the energy of the current annealing process; accepting the new parameters if the energy decreases, and otherwise accepting them with a probability determined by the acceptance temperature and a coupling term linking the parallel annealing processes; updating the current optimal solution of each annealing process; and, if neither the temperature threshold nor the maximum number of iterations K has been reached, returning to the step of generating new model parameters through the perturbation function, otherwise outputting the optimal model parameters, found by the CSA, that minimize the prediction confidence error of the neural network for the current training round.
  8. A dual confidence calibration system for a neural network classifier, applying the method of any one of claims 1 to 7, comprising a loss function construction module, a class-by-class multi-partition temperature scaling model design module, a dual confidence calibration model construction module, and a prediction result calibration module, wherein: the loss function construction module constructs a loss function for confidence calibration in the neural network training stage; the class-by-class multi-partition temperature scaling model design module designs a class-by-class multi-partition temperature scaling model for confidence calibration in the neural network inference stage; the dual confidence calibration model construction module constructs a dual confidence calibration model based on the loss function and the class-by-class multi-partition temperature scaling model; and the prediction result calibration module, in the test application stage, averages the calibrated prediction probability vectors of a plurality of classifier models saved at different training rounds, each with its corresponding class-by-class multi-partition temperature scaling model, class by class, where the maximum average probability is the prediction confidence and the corresponding class is the predicted class.
  9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
  10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.

Description

Dual confidence calibration method, system, equipment and medium for neural network classifier

Technical Field

The invention relates to the technical field of machine learning, and in particular to a dual confidence calibration method, system, device, and medium for a neural network classifier.

Background

Neural network classifiers have achieved significant success in many areas, but their confidence calibration remains a major challenge in practical use. Ideal calibration means that when the model outputs a given confidence value, its actual prediction accuracy matches it: if the model assigns 80% confidence to 100 predictions, about 80 of them should be classified correctly. A well-calibrated classifier helps users assess the risk of prediction errors, which is essential in safety-critical fields such as medical diagnosis and autonomous driving. However, modern neural networks (especially deep networks) are prone to over-confidence, tending to assign high confidence scores to both correct and incorrect predictions. Moreover, under class imbalance, classifiers tend to be over-confident on majority-class samples and under-confident on minority-class samples. These issues both increase the prediction confidence error and make calibration harder. Existing confidence calibration methods fall into two categories: optimization in the network training stage and post-processing in the network inference stage. Training-stage methods mainly achieve calibration by optimizing the loss function or introducing a regularization mechanism, such as the focal loss and label smoothing.
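To make the training-stage family concrete, the standard focal loss mentioned above can be sketched as follows. This is a minimal NumPy sketch of the generic focal loss, not the patent's dual focal loss; the modulation factor `gamma` follows the common convention, where `gamma = 0` recovers ordinary cross-entropy.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0):
    """Mean focal loss over a batch.

    probs: (N, M) predicted class probabilities (rows sum to 1)
    labels: (N,) integer true-class indices
    gamma: modulation factor; gamma = 0 recovers cross-entropy
    """
    p_true = probs[np.arange(len(labels)), labels]  # probability of each true class
    # Down-weight well-classified samples by (1 - p_true)^gamma.
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true)))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
```

Because confident correct predictions are down-weighted, the focal loss discourages the network from pushing probabilities toward 1, which is why it tends to reduce over-confidence relative to plain cross-entropy.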
Inference-stage methods do not change the parameters of the classification model; they only adjust the predicted probability distribution through post-processing, mainly temperature scaling, isotonic regression, and the like. Existing research generally applies only single-stage calibration, or treats the two stages as independent pipelines; the lack of a dual confidence calibration method that jointly optimizes both stages limits further improvement of the calibration effect.

Disclosure of Invention

To achieve the above and other advantages and in accordance with the purpose of the present invention, a first object of the present invention is to provide a dual confidence calibration method for a neural network classifier, comprising the steps of: constructing a loss function for confidence calibration in the neural network training stage; designing a class-by-class multi-partition temperature scaling model for confidence calibration in the neural network inference stage; constructing a dual confidence calibration model based on the loss function and the class-by-class multi-partition temperature scaling model; and, in the test application stage, averaging the calibrated prediction probability vectors of a plurality of classifier models saved at different training rounds, each with its corresponding class-by-class multi-partition temperature scaling model, class by class, where the maximum average probability is the prediction confidence and the corresponding class is the predicted class.
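For reference, conventional single-coefficient temperature scaling — the baseline that the patent's class-by-class multi-partition model refines — together with a standard equal-width-bin ECE can be sketched as follows. This is a minimal NumPy sketch; the patent's variant instead assigns separate temperature coefficients per class and per confidence interval, and its ECE separates over- and under-confidence terms.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=1, keepdims=True)

def temperature_scale(logits, T):
    """Classic temperature scaling: divide all logits by one scalar T."""
    return softmax(logits / T)

def ece(probs, labels, n_bins=10):
    """Expected calibration error over equal-width confidence bins."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels)
    err, n = 0.0, len(labels)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # |accuracy - average confidence| weighted by bin population
            err += mask.sum() / n * abs(correct[mask].mean() - conf[mask].mean())
    return float(err)
```

A temperature T > 1 softens an over-confident distribution, while T < 1 sharpens an under-confident one; the limitation of a single scalar T is precisely that it cannot soften some classes or confidence intervals while sharpening others.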
Further, the step of constructing a loss function for confidence calibration in the neural network training stage includes: dividing the classification data set to be processed into a training set and a validation set in a preset proportion; and optimizing the neural network parameters on the training set using a hybrid loss function oriented toward confidence calibration. Further, the hybrid loss function comprises a dual focal loss together with a difference loss between multi-class confidence and correctness, whose parameters include a weight coefficient, a modulation factor, the number of samples in a batch, the number of classes, the probability with which the neural network model predicts the i-th sample as the m-th class, the highest predicted probability among the non-true classes for the i-th sample, and an indicator that equals 1 when the true class of the i-th sample is the m-th class and 0 otherwise. Further, the step of designing a class-by-class multi-partition temperature scaling model for inference-stage confidence calibration includes: for an input sample, recording the logits vector output by the neural network classifier, the predicted class, and the corresponding prediction confidence; by setting in
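The coupled simulated annealing search described in the claims can be illustrated, in simplified single-chain form, roughly as follows. This is a hedged sketch: the perturbation scale `sigma`, the geometric cooling schedule, and the toy objective are placeholders, and the coupling term that links the parallel annealing processes in the patent's CSA is omitted.

```python
import math
import random

def simulated_annealing(objective, x0, sigma=0.1, t0=1.0,
                        cooling=0.95, k_max=500, seed=0):
    """Minimal single-chain simulated annealing over one scalar parameter.

    The patent's CSA runs several such chains in parallel and couples
    their acceptance probabilities; that coupling is omitted here.
    """
    rng = random.Random(seed)
    x, e = x0, objective(x0)
    best_x, best_e = x, e
    t = t0
    for _ in range(k_max):
        cand = x + sigma * rng.gauss(0.0, 1.0)      # perturbation step
        e_cand = objective(cand)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-dE / t).
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling                                # geometric cooling
    return best_x, best_e

# Toy objective standing in for the ECE of a temperature-scaled model:
best_T, best_err = simulated_annealing(lambda T: (T - 1.5) ** 2, x0=1.0)
```

In the patent's setting the scalar would be replaced by the full vector of per-class, per-interval temperature coefficients, and the objective would be the validation-set ECE of the calibrated classifier.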