CN-116992941-B - Convolutional neural network pruning method and device based on feature similarity and feature compensation

Abstract

The invention discloses a convolutional neural network pruning method and device based on feature similarity and feature compensation. The method comprises the steps of adjusting the size of an input image in a dataset to be a fixed size, carrying out standardization processing on pixel values of the image, adding training data through an image enhancement technology, initializing a network structure, setting model parameters, training a model through the training data, obtaining similarity among convolution kernels according to the trained model, carrying out clustering analysis according to the similarity, carrying out similar grouping on each layer of convolution kernels, selecting a reserved convolution kernel in each similar grouping to generate a new network structure, copying the reserved convolution kernel parameters into the new network structure, compensating weight parameters of the pruned convolution kernels in each similar grouping into the reserved convolution kernels in a parameter superposition mode, carrying out model accuracy restorative training through an original dataset, and storing the model parameters and the network structure. The pruning method can maintain the model precision.

Inventors

  • TANG BIN
  • WANG QIANG

Assignees

  • Hohai University (河海大学)

Dates

Publication Date
2026-05-05
Application Date
2023-08-17

Claims (7)

  1. A convolutional neural network pruning method based on feature similarity and feature compensation, characterized by comprising the following steps: adjusting the size of each input image in a dataset to a fixed size, standardizing the pixel values of the images, and increasing the training data of the original dataset through image enhancement techniques; initializing a network structure, setting model parameters, and training the model with the training data; for the trained model, obtaining the similarity between convolution kernels, performing cluster analysis according to the similarity, grouping each layer's convolution kernels by similarity, and selecting a retained convolution kernel from each similarity group to generate a new network structure, wherein obtaining the similarity between convolution kernels comprises stretching the three-dimensional tensor of each convolution kernel into a one-dimensional tensor and calculating the cosine similarity between the one-dimensional tensors; copying the retained convolution kernel parameters into the new network structure, and compensating the weight parameters of the pruned convolution kernels in each similarity group into the retained convolution kernel by parameter superposition, wherein parameter superposition means that the weight parameters of each group's pruned convolution kernels are shared into the retained convolution kernel, sharing meaning that the tensors at corresponding positions of the convolution kernels in the group are superposed to form the convolution kernel after parameter sharing; and performing model-accuracy recovery training with the original dataset, and saving the model parameters and the network structure.
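The flatten-and-compare step of claim 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name and the toy layer shape are assumptions, and the weight layout (out_channels, in_channels, H, W) follows the common convolution convention.

```python
import numpy as np

def kernel_cosine_similarity(weights):
    """Pairwise cosine similarity between the convolution kernels of one
    layer.  `weights` has shape (out_channels, in_channels, H, W); each
    3-D kernel is flattened into a 1-D vector, as described in claim 1."""
    flat = weights.reshape(weights.shape[0], -1)       # (out, in*H*W)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.clip(norms, 1e-12, None)          # guard against /0
    return unit @ unit.T                               # (out, out) matrix

# toy layer: 4 kernels, 2 input channels, 3x3 spatial size
rng = np.random.default_rng(0)
w = rng.standard_normal((4, 2, 3, 3))
sim = kernel_cosine_similarity(w)
print(np.allclose(np.diag(sim), 1.0))  # True: each kernel matches itself
```

The resulting similarity matrix is what the subsequent cluster analysis operates on.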
  2. The method of claim 1, wherein the tensor stretching process is expressed as: v_{i,j} = flatten(K_{i,j}), K_{i,j} ∈ R^{N_i × H × W}; wherein K_{i,j} is the j-th convolution kernel in the i-th layer, which consists of N_i two-dimensional tensors of size H × W and is stretched into a one-dimensional tensor v_{i,j} of length N_i · H · W; N_i represents the number of input feature maps of the i-th layer, N_{i+1} represents the number of output feature maps of the i-th layer, and H and W represent the height and width of the two-dimensional tensors in each three-dimensional tensor, respectively.
  3. The method of claim 1, wherein grouping each layer's convolution kernels by similarity comprises, for a neural network with L convolutional layers, grouping each layer's convolution kernels using the k-means cluster analysis algorithm, the number of groups being determined by the pruning rate P_i set for each layer, wherein the number of groups for the i-th layer's convolution kernels is: G_i = ⌊N_{i+1} × (1 − P_i)⌋; wherein P_i is the pruning rate of the i-th layer and N_{i+1} represents the number of output feature maps of the i-th layer.
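Since each similarity group contributes exactly one retained kernel, the group count equals the number of kernels that survive pruning. The sketch below illustrates this, with a minimal k-means as a stand-in for the clustering step; the floor in `group_count` and the k-means details (iteration count, initialization) are assumptions not fixed by the claim text.

```python
import numpy as np

def group_count(n_out, prune_rate):
    """Number of similarity groups for a layer (claim 3): groups equal
    kernels retained, here floored and clamped to at least one group."""
    return max(1, int(n_out * (1.0 - prune_rate)))

def kmeans_groups(flat_kernels, k, iters=20, seed=0):
    """Minimal k-means over flattened kernel vectors; an illustrative
    stand-in, since the patent names k-means but no implementation."""
    rng = np.random.default_rng(seed)
    centers = flat_kernels[rng.choice(len(flat_kernels), k, replace=False)]
    for _ in range(iters):
        # distance of every kernel vector to every center -> (n, k)
        d = np.linalg.norm(flat_kernels[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):               # skip empty clusters
                centers[j] = flat_kernels[labels == j].mean(axis=0)
    return labels

n_out, prune_rate = 8, 0.5
k = group_count(n_out, prune_rate)    # 4 groups -> 4 kernels survive
rng = np.random.default_rng(1)
flat = rng.standard_normal((n_out, 18))   # 8 flattened 1x2x3x3 kernels
labels = kmeans_groups(flat, k)
print(k, labels.shape)
```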
  4. The method of claim 1, wherein generating the new network structure comprises setting a subscript for each convolution kernel of each layer, saving the subscripts of the retained convolution kernels in a mask array, and generating the new network structure from the mask array, wherein the mask array is represented as: M_i = [m_{i,1}, m_{i,2}, …, m_{i,N_{i+1}}], m_{i,j} ∈ {0, 1}; wherein M_i is the mask array of the i-th layer, m_{i,j} = 1 represents that the j-th convolution kernel in the i-th layer is retained and belongs to the set of retained convolution kernels, and m_{i,j} = 0 represents that the j-th convolution kernel in the i-th layer is pruned and belongs to the set of pruned convolution kernels.
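Building the mask array from the similarity groups can be sketched as below, using the L1 (sum-of-absolute-weights) selection criterion that claim 5 names for picking the retained kernel; the function name and toy values are illustrative only.

```python
import numpy as np

def build_mask(flat_kernels, labels):
    """Mask array of claim 4: within each similarity group, keep the
    kernel with the largest sum of absolute weight values (claim 5's
    criterion); 1 = retained, 0 = pruned."""
    l1 = np.abs(flat_kernels).sum(axis=1)          # L1 norm per kernel
    mask = np.zeros(len(flat_kernels), dtype=int)
    for g in np.unique(labels):
        members = np.flatnonzero(labels == g)
        mask[members[np.argmax(l1[members])]] = 1  # best kernel in group
    return mask

# 4 kernels flattened to length-2 vectors, in 2 similarity groups
flat = np.array([[1.0, 2.0], [0.5, 0.5], [3.0, -1.0], [0.1, 0.2]])
labels = np.array([0, 0, 1, 1])
print(build_mask(flat, labels))  # [1 0 1 0]: kernels 0 and 2 win on L1 norm
```

The new, smaller network structure is then derived layer by layer from the number of ones in each mask.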
  5. A convolutional neural network pruning device based on feature similarity and feature compensation, comprising: a data preprocessing module configured to adjust the size of each input image in the dataset to a fixed size, standardize the pixel values of the images, and increase the training data of the original dataset through image enhancement techniques; a model training module configured to initialize a network structure, set model parameters, and train the model with the training data; a network pruning module configured to obtain the similarity between convolution kernels of the trained model, perform cluster analysis according to the similarity, group each layer's convolution kernels by similarity, select a retained convolution kernel from each similarity group, and generate a new network structure, wherein obtaining the similarity between convolution kernels comprises stretching the three-dimensional tensor of each convolution kernel into a one-dimensional tensor and calculating the cosine similarity between the one-dimensional tensors, and selecting a retained convolution kernel from each similarity group comprises calculating the sum of the absolute values of each convolution kernel's weight parameters from the one-dimensional stretching result and selecting the convolution kernel with the largest such sum in each similarity group as the retained convolution kernel; a parameter compensation module configured to copy the retained convolution kernel parameters into the new network structure and compensate the weight parameters of the pruned convolution kernels in each similarity group into the retained convolution kernel by parameter superposition, wherein parameter superposition means that the weight parameters of each group's pruned convolution kernels are shared into the retained convolution kernel, sharing meaning that the tensors at corresponding positions of the convolution kernels in the group are superposed to form the convolution kernel after parameter sharing; and an accuracy recovery module configured to perform model-accuracy recovery training with the original dataset and save the model parameters and the network structure.
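The parameter-superposition (feature compensation) step described above amounts to an element-wise sum over each similarity group. A minimal sketch, with an illustrative function name and toy weights:

```python
import numpy as np

def compensate(weights, labels):
    """Feature compensation (claims 1 and 5): within each similarity
    group, superpose the tensors at corresponding positions of all the
    group's kernels; the sum is the shared-parameter kernel copied into
    the new network.  `weights`: (out_channels, in_channels, H, W)."""
    kept = []
    for g in np.unique(labels):
        members = np.flatnonzero(labels == g)
        kept.append(weights[members].sum(axis=0))  # element-wise superposition
    return np.stack(kept)                          # one kernel per group

# toy layer: 4 kernels of shape (1, 2, 2), in 2 similarity groups
w = np.ones((4, 1, 2, 2))
w[1] *= 2.0                       # second kernel has weight 2 everywhere
labels = np.array([0, 0, 1, 1])
new_w = compensate(w, labels)
print(new_w.shape, new_w[0, 0, 0, 0])  # (2, 1, 2, 2) 3.0  (group 0: 1+2)
```

After this step the compressed layer has one kernel per group, and recovery training on the original dataset fine-tunes the summed parameters.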
  6. A computer device comprising one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and, when executed by the one or more processors, implement the steps of the convolutional neural network pruning method based on feature similarity and feature compensation of any one of claims 1 to 4.
  7. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the convolutional neural network pruning method based on feature similarity and feature compensation of any one of claims 1 to 4.

Description

Convolutional neural network pruning method and device based on feature similarity and feature compensation

Technical Field

The invention relates to the field of computer vision and model compression, in particular to a convolutional neural network pruning method and device based on feature similarity and feature compensation.

Background

As machine computing power continues to increase, convolutional neural networks (CNNs) have made significant progress in computer vision tasks such as image recognition, object detection, and image segmentation. However, the performance of a CNN model is closely related to its complexity: achieving the best results on various computer vision tasks typically requires deeper and wider networks, resulting in large floating-point operation counts and parameter counts. Computing tasks with real-time or privacy requirements call for model inference directly on edge devices, but resource-constrained edge devices often struggle to deploy large-scale network models. Reducing a model's floating-point operations and parameters is therefore of great significance for edge deployment.

To overcome the difficulty of edge deployment, network pruning is adopted in many application scenarios: given the huge floating-point operation and parameter counts, weight parameters or network structures with little influence on model accuracy can be pruned away. Network pruning schemes divide into unstructured pruning and structured pruning. The drawback of unstructured pruning is that it merely sparsifies the weight matrices, so without specialized computing hardware it is difficult to achieve real compression and acceleration.
Structured pruning compresses the network by removing structural elements such as convolution kernels or convolutional layers, so the model's original convolutional structure is preserved after pruning. Mainstream structured pruning schemes assign an importance index to each convolution kernel or channel, set a global threshold, and prune the kernels whose importance falls below it. The pruned model is markedly smaller than the original, which alleviates, to a certain extent, the difficulty of deploying network models directly on edge devices.

Most structured pruning schemes, however, focus on the design of the kernel or channel importance index and ignore the similarity between kernel or channel features. During inference, the output feature maps of different convolution kernels can be similar, and pruning kernels with similar behavior can reduce the loss of model accuracy while compressing the model. In addition, after conventional training, the parameters slated for pruning still carry feature information that contributes positively to model accuracy; during pruning, this feature information needs to be transferred into the retained network structure by means of feature compensation. It is therefore necessary for structured pruning to consider both feature similarity and feature compensation.
Disclosure of Invention

Aiming at the shortcomings of the prior art, the invention provides a convolutional neural network pruning method and device based on feature similarity and feature compensation, which can reduce the loss of model accuracy while reducing the model's floating-point operation and parameter counts, so that the model can be deployed directly on edge devices, improving inference real-time performance and protecting privacy.

Technical scheme: in order to achieve the purpose of the invention, the technical scheme of the invention is as follows. A convolutional neural network pruning method based on feature similarity and feature compensation comprises the following steps: the size of each input image in a dataset is adjusted to a fixed size, the pixel values of the images are standardized, and the training data of the original dataset is increased through image enhancement techniques; a network structure is initialized, model parameters are set, and the model is trained with the training data; for the trained model, the similarity between convolution kernels is obtained, cluster analysis is performed according to the similarity, each layer's convolution kernels are grouped by similarity, a retained convolution kernel is selected from each similarity group, and generating a new net