CN-114037871-B - Image classification interpretable method based on neural support decision tree

Abstract

The invention discloses an interpretable image classification method based on a neural support decision tree. First, a CNN trained for an image classification task is obtained, and the parameter vectors of its fully connected layer are extracted and treated as the representative vectors of leaf nodes. Agglomerative hierarchical clustering is then used to repeatedly find similar nodes and generate their parent nodes and representative vectors until the root node is reached, yielding a neural support decision tree embedded in the CNN. The image under test is input into the CNN model, and the resulting feature vector is input into the neural support decision tree to obtain a decision link from the root node to a leaf node together with the decision probability of each node. Finally, Grad-CAM is used to compute heat maps of all leaf nodes under the different intermediate nodes on the link, the heat maps are superposed with weights given by the selection probabilities, and a visual interpretation combining the heat maps and the decision tree is output. The invention allows the decision process of the learned model to be displayed as a step-by-step decision tree while also generating heat maps that reveal where the decision attention is focused, thereby achieving a better interpretation effect.

Inventors

KONG XIANGWEI
LI ZHILI

Assignees

Zhejiang University (浙江大学)

Dates

Publication Date: 20260508
Application Date: 20211109

Claims (4)

1. An interpretable image classification method based on a neural support decision tree, the method comprising the following steps: Step 1) acquiring a trained CNN used for an image classification task, and extracting the parameter vectors of the fully connected layer of the CNN; Step 2) taking the parameter vectors of the fully connected layer as the representative vectors of leaf nodes, and using agglomerative hierarchical clustering to repeatedly find similar nodes and generate their parent nodes and representative vectors until the root node is reached, thereby obtaining a neural support decision tree embedded in the CNN; Step 3) inputting the image under test into the CNN model and taking the feature vector before the fully connected layer as the image feature vector; Step 4) inputting the image feature vector into the neural support decision tree to obtain a decision link from the root node to a leaf node, and obtaining, for each intermediate node on the decision link, the selection probabilities of entering its left child node and its right child node; Step 5) using Grad-CAM to compute heat maps of all leaf nodes under the different intermediate nodes on the link, performing weighted superposition according to the selection probabilities, and finally outputting a visual interpretation combining the heat maps and the decision tree, specifically as follows: before a heat map is generated for each node on the decision link, the Grad-CAM method is used to generate class activation mapping feature maps for all leaf nodes. First, the gradient of the probability $p_c$ of leaf node $c$ with respect to the feature maps $A^k$ of the last convolutional layer of the CNN is computed; the gradient information obtained at the different pixels is averaged to obtain the neuron importance weight $\alpha_k^c$ of feature map $A^k$ for leaf node $c$; the feature maps of the last convolutional layer are weighted by these neuron importance weights; and finally only the positive values are kept to obtain the class activation mapping feature map $L^c$ of leaf node $c$. The specific calculation is:

$$\alpha_k^c = \frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial p_c}{\partial A_{ij}^k}, \qquad L^c = \mathrm{ReLU}\Big(\sum_{k}\alpha_k^c A^k\Big)$$

where $Z$ is the number of pixels of the feature map, $\frac{1}{Z}\sum_{i}\sum_{j}$ is the global average pooling operation, $p_c$ is the decision probability of leaf node $c$, $A_{ij}^k$ is the pixel value at position $(i, j)$ of the $k$-th feature map, and $\partial p_c / \partial A_{ij}^k$ is the gradient information computed by back-propagation. After the class activation mapping feature maps $L^c$ of all leaf nodes have been computed, the decision tree $T$ is entered again: the class activation mapping feature maps of all leaf nodes contained in the left subtree and in the right subtree of the root node are superposed separately to obtain the superposed class activation mapping feature maps of the left and right subtrees, and the two superposed feature maps are then superposed a second time with the probabilities of the left and right subtrees as weights, finally giving the class activation mapping feature map $L^n$ for the decision from the root node to its next node. This map indicates the important regions attended to when the root node decides which node to enter next, i.e. the source of the visual interpretation. After the root node, the next node on the decision link is entered and the calculation of the left and right subtree feature maps is repeated until a leaf node is reached, giving the class activation mapping feature map of every node on the decision link. The specific calculation is:

$$L^{n} = \mathrm{ReLU}\Big(p_{\mathrm{left}}^{n}\sum_{c \in S_{\mathrm{left}}^{n}} L^{c} + p_{\mathrm{right}}^{n}\sum_{c \in S_{\mathrm{right}}^{n}} L^{c}\Big)$$

where $L^c$ is the class activation mapping feature map of leaf node $c$, $S_{\mathrm{left}}^{n}$ and $S_{\mathrm{right}}^{n}$ are the sets of leaf nodes contained under the left and right subtrees of decision node $n$, $p_{\mathrm{left}}^{n}$ and $p_{\mathrm{right}}^{n}$ are the selection probabilities of entering the left and right child nodes of decision node $n$, and $\mathrm{ReLU}$ denotes the rectified linear unit. The class activation mapping feature map $L^n$ is then restored (resized) to generate a heat map of the same size as the original image, highlighting the important regions of the input image that influence the node's decision, and the heat maps and the probabilities are embedded into the visualized decision tree to obtain the visual interpretation result.
2. The interpretable image classification method based on a neural support decision tree according to claim 1, wherein step 2) is specifically: the parameters $W$ of the fully connected layer of the CNN are split into $K$ vectors of dimension $D$, which are used as the representative vectors of the initialized leaf nodes, and the two most similar leaf nodes are found with an agglomerative hierarchical clustering algorithm (Agglomerative Hierarchical Clustering), calculated as:

$$(i^{*}, j^{*}) = \arg\min_{i \neq j,\; w_i, w_j \in N} \lVert w_i - w_j \rVert_2$$

where $N$ is the set of representative vectors of all initialized leaf nodes, the number of leaf nodes is $K$, i.e. the number of classes of the classification task, $w_i$ is the representative vector of leaf node $i$ in the set $N$, and $\lVert \cdot \rVert_2$ computes the Euclidean distance between vectors. The expression above selects the two leaf nodes closest to each other as left and right child nodes that share the same parent node. Once the two similar leaf nodes are found, their parent node can be determined: the mean of the representative vectors of the left leaf node and the right leaf node is taken as the representative vector of the parent node. All parent nodes are then clustered in the same way to find the grandparent node of similar parent nodes, with the mean of the representative vectors of all leaf nodes under the left and right subtrees used as the representative vector of the grandparent node. The process is repeated from bottom to top until the root node is reached, giving a binary decision tree $T$ in which every node contains its representative vector, calculated as:

$$r_n = \frac{1}{|S_n|}\sum_{c \in S_n} w_c$$

where $S_n$ is the set of all $|S_n|$ leaf nodes contained under the left and right subtrees of intermediate node $n$, and $r_n$ is the representative vector of intermediate node $n$.
3. The interpretable image classification method based on a neural support decision tree according to claim 1, wherein step 3) is specifically: the preprocessed image under test $x$ is input into the CNN model, and the vector obtained before the CNN enters the fully connected layer is extracted as the image feature, calculated as:

$$f = \mathrm{CNN}(x)$$

where $\mathrm{CNN}$ is the trained convolutional neural network model, $x$ is the preprocessed tensor of the input image, and $f$ is the feature vector obtained after feature extraction by the CNN.
4. The interpretable image classification method based on a neural support decision tree according to claim 1, wherein step 4) is specifically: the extracted image feature vector $f$ is input into the neural support decision tree $T$ obtained in step 2). Because entering the decision tree necessarily passes through the root node, the probability of the root node is 1. The inner products of the representative vectors $r_{\mathrm{left}}$ and $r_{\mathrm{right}}$ of the left and right child nodes under the root node with the feature vector $f$ are computed to obtain the selection probabilities of the two nodes; the child node with the larger selection probability is marked as a decision link node and entered; the probability values of the left and right child nodes below it are then examined in the same way, and this decision selection is repeated down to a bottom leaf node, generating a decision link $P$ from the root node to the bottom leaf node, in which the $i$-th node on the link contains the predicted probability value $p_i$ of that node. The specific probability calculation is:

$$(p_{\mathrm{left}}, p_{\mathrm{right}}) = \mathrm{softmax}\big(\langle r_{\mathrm{left}}, f \rangle,\; \langle r_{\mathrm{right}}, f \rangle\big)$$

where $p_{\mathrm{left}}$ and $p_{\mathrm{right}}$ are the selection probabilities of the left and right child nodes below the entered node, $r_{\mathrm{left}}$ and $r_{\mathrm{right}}$ are the representative vectors of the left and right child nodes below the entered node, $f$ is the feature vector of the image under test after feature extraction by the CNN, $\langle \cdot, \cdot \rangle$ is the vector inner product, and $\mathrm{softmax}$ denotes the softmax function.
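
The tree construction, decision-link traversal, and probability-weighted superposition recited in the claims above can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the names `Node`, `build_tree`, `decide`, and `node_cam` are hypothetical, the fully connected weights and leaf class activation maps are toy values, and the sketch assumes that every current top-level node is a merge candidate at each clustering step and that leaf maps were already computed by Grad-CAM and flattened into lists.

```python
import math

def mean(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

class Node:
    def __init__(self, leaves, left=None, right=None):
        self.leaves = leaves                    # {class index: FC weight row} of all leaves below
        self.left, self.right = left, right
        self.vec = mean(list(leaves.values()))  # representative vector: mean of leaf vectors

def build_tree(fc_rows):
    """Merge the two closest nodes repeatedly until a single root remains."""
    nodes = [Node({k: row}) for k, row in enumerate(fc_rows)]
    while len(nodes) > 1:
        pairs = [(dist(nodes[i].vec, nodes[j].vec), i, j)
                 for i in range(len(nodes)) for j in range(i + 1, len(nodes))]
        _, i, j = min(pairs)
        a, b = nodes[i], nodes[j]
        parent = Node({**a.leaves, **b.leaves}, a, b)
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [parent]
    return nodes[0]

def softmax2(a, b):
    """Numerically stable softmax over two scores."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

def decide(root, feat):
    """Follow the more probable child from the root down to a leaf."""
    path, node = [], root
    while node.left is not None:
        p_l, p_r = softmax2(dot(node.left.vec, feat), dot(node.right.vec, feat))
        path.append((node, p_l, p_r))
        node = node.left if p_l >= p_r else node.right
    return path, node

def node_cam(node, p_l, p_r, leaf_cams):
    """Sum leaf maps per subtree, weight by child probabilities, keep positives."""
    def subtree_sum(child):
        cams = [leaf_cams[k] for k in child.leaves]
        return [sum(px) for px in zip(*cams)]
    left, right = subtree_sum(node.left), subtree_sum(node.right)
    return [max(0.0, p_l * a + p_r * b) for a, b in zip(left, right)]

# Toy data (assumed, not from the patent): 3 classes, 2-dimensional features.
fc_rows = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
leaf_cams = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 2.0]}  # flattened 1x2 maps

root = build_tree(fc_rows)
path, leaf = decide(root, [0.0, 2.0])
node, p_l, p_r = path[0]
cam = node_cam(node, p_l, p_r, leaf_cams)
print(sorted(leaf.leaves))  # [2]
```

In this toy run, classes 0 and 1 are merged first because their weight rows are closest, and the feature vector `[0.0, 2.0]` is routed to the leaf for class 2. A real pipeline would take `fc_rows` from the trained CNN's fully connected layer, compute the leaf maps with Grad-CAM, and resize each node map to the input image size.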

Description

Image classification interpretable method based on neural support decision tree

Technical Field

The invention belongs to the field of image classification, and particularly relates to V-NBDT, an interpretable image classification method based on a neural support decision tree.

Background

Artificial intelligence in the context of deep learning plays a significant role in processing decisions related to images, text, and speech, and continues to motivate businesses and governments to build systems around artificial intelligence in different fields. Most artificial intelligence research at the present stage still focuses on the accuracy and complexity of model algorithms. Complex models can make accurate decisions in specific situations and even exceed human level, but their decision process and reasons are difficult for end users to understand, which reduces users' trust in the models. This end-to-end lack of explainability hinders the optimization and improvement of intelligent decision systems. Providing users with interpretations of decision results while maintaining model performance, so as to gain the trust of decision makers, has therefore become one of the important research directions in artificial intelligence in recent years, namely explainable artificial intelligence (XAI, Explainable Artificial Intelligence). Artificial intelligence interpretability can be broadly divided into two categories: pre-event (ante-hoc) and post-event (post-hoc) interpretability.
Pre-event (ante-hoc) interpretability means that the interpretability of the model exists before the model is trained and is built into the model, as with structurally simple models such as logistic regression and decision trees, or is built into the overall model by design, for example by adding an attention mechanism. Post-event (post-hoc) interpretability means that interpretation is produced after the model has been trained: given a trained model, different methods such as rule extraction and model distillation are used to probe the operating mechanism inside the model and provide an interpretation. In key fields such as national defense, law, finance, and medicine, the decision interpretations generated by explainable artificial intelligence can help users understand the reasons for a decision and, to a certain extent, avoid uncontrollable risks. The guidelines of the National Natural Science Foundation of China's management sciences program "Big Data Driven Management and Decision Research" state that key projects should address the theory and methods of big data analysis for management decisions, including the interpretability of artificial intelligence methods, which highlights the importance of this field. Convolutional neural networks are widely applied to image classification tasks: a class prediction is obtained after the image under test is input into the network. To let users of the network intuitively understand the mechanism behind its decisions, some post-hoc interpretation methods draw on the structural features of the model to generate saliency maps and interpret decisions by visualizing the regions critical to them.
Methods that use image data as the interpretation carrier include local perturbation (local occlusion maps), back-propagation (scatter diagrams), and activation mapping (heat maps). These methods share the drawback of oversimplified interpretation results, i.e. only a single saliency map serves as the reference for interpreting a decision, whereas a decision tree has a structure that better matches human reasoning logic and therefore helps users understand the levels and process of the model's decision. Since image data drives the model's decisions, iterating on, optimizing, and fusing different methods to provide better explanations of those decisions can promote the development of the explainable artificial intelligence field. The invention is inspired by two kinds of methods: visual interpretation methods represented by activation mapping, and tree-structured interpretation methods represented by the neural support decision tree. CAM methods interpret a decision mainly through the salient decision region of a single image; such an interpretation displays, intuitively and clearly in the form of a heat map, the image positions the network attends to when making its classification judgment. The biggest problem of these methods is that interpreting a decision through a heat map alone is overly result-oriented and lacks depth of interpretation and a presentation of the decision process, although the image area of interest of the neural network can be reflecte