
CN-122021765-A - Neural network pruning method and device for edge devices, and storage medium

CN122021765A

Abstract

The invention belongs to the technical fields of edge computing and deep learning, and relates to a neural network pruning method, device, and storage medium for edge devices. Data samples are input into an initial deep convolutional neural network; a first ratio is calculated for each parameter based on that parameter's gradient value during training, and a second ratio is calculated for each parameter based on its value after training. The first ratio and the second ratio of each parameter are input into a Copula function, which outputs an importance weight for each parameter; parameters are then pruned according to these importance weights to obtain a target deep convolutional neural network, which is deployed to the edge device. By exploiting the relationship between the parameters and the training gradients, the method and device identify the parameters with the least influence on the model output, ensure that the pruned model can be deployed on edge devices, and maintain high accuracy during data processing.

Inventors

  • DENG TAO
  • JIANG ZIHAO
  • JIA JUNCHENG
  • HUANG HE

Assignees

  • Soochow University (苏州大学)

Dates

Publication Date
2026-05-12
Application Date
2025-12-05

Claims (10)

  1. A neural network pruning method for edge devices, characterized by comprising the following steps: S10, inputting data samples from a training set into an initial deep convolutional neural network deployed on an edge device, and training the initial deep convolutional neural network; S20, calculating a first ratio for each parameter based on that parameter's gradient value in the initial deep convolutional neural network during training; S30, calculating a second ratio for each parameter based on that parameter's value in the initial deep convolutional neural network after training is completed; S40, inputting the first ratio and the second ratio of each parameter into a Copula function to calculate the nonlinear correlation between each parameter and its gradient, and outputting an importance weight for each parameter; and S50, pruning parameters based on the importance weights to obtain a target deep convolutional neural network, and deploying the target deep convolutional neural network to the edge device.
  2. The neural network pruning method for edge devices according to claim 1, wherein pruning parameters based on the importance weights to obtain a target deep convolutional neural network comprises: sorting the importance weights of all parameters in descending order and pruning the parameters corresponding to the last N importance weights to obtain a pruned deep convolutional neural network, where N ≥ 1; and taking the pruned deep convolutional neural network as a new initial deep convolutional neural network and returning to step S10, until the ratio of the total number of pruned parameters to the number of parameters of the initial deep convolutional neural network before the first pruning is greater than or equal to a preset pruning ratio.
  3. The neural network pruning method for edge devices according to claim 1, wherein calculating a first ratio for each parameter based on that parameter's gradient value during training comprises: constructing a parameter arrangement matrix based on the connection relations among the parameters of the initial deep convolutional neural network; and, for each parameter in the parameter arrangement matrix, taking the ratio of that parameter's gradient value during training to the sum of the gradient values, during training, of all parameters within a preset local neighborhood of that parameter as the first ratio of the parameter.
  4. The neural network pruning method for edge devices according to claim 3, wherein the first ratio of each parameter is calculated as $P^{g}_{i,j} = \frac{g_{i,j}}{G_{i,j}}$, with $G_{i,j} = \sum_{m=i-r}^{i+r}\sum_{n=j-r}^{j+r} g_{m,n}$, where $P^{g}_{i,j}$ denotes the first ratio of the parameter in row $i$ and column $j$ of the parameter arrangement matrix; $g_{i,j}$ denotes the gradient value of that parameter during training; $G_{i,j}$ denotes the sum of the gradient values, during training, of all parameters within the preset local neighborhood of the parameter in row $i$ and column $j$; $r$ denotes the radius of the preset local neighborhood; and $g_{m,n}$ denotes the gradient value of the parameter in row $m$ and column $n$ of the parameter arrangement matrix during training.
  5. The neural network pruning method for edge devices according to claim 1, wherein calculating a second ratio for each parameter based on that parameter's value after training is completed comprises: constructing a parameter arrangement matrix based on the connection relations among the parameters of the initial deep convolutional neural network; and, for each parameter in the parameter arrangement matrix, obtaining the second ratio of the parameter as the ratio of that parameter's value after training to the sum of the post-training values of all parameters within a preset local neighborhood of that parameter.
  6. The neural network pruning method for edge devices according to claim 5, wherein the second ratio of each parameter is calculated as $P^{w}_{i,j} = \frac{w_{i,j}}{W_{i,j}}$, with $W_{i,j} = \sum_{m=i-r}^{i+r}\sum_{n=j-r}^{j+r} w_{m,n}$, where $P^{w}_{i,j}$ denotes the second ratio of the parameter in row $i$ and column $j$ of the parameter arrangement matrix; $w_{i,j}$ denotes the value of that parameter after training is completed; $W_{i,j}$ denotes the sum of the post-training values of all parameters within the preset local neighborhood of the parameter in row $i$ and column $j$; $r$ denotes the radius of the preset local neighborhood; and $w_{m,n}$ denotes the post-training value of the parameter in row $m$ and column $n$ of the parameter arrangement matrix.
  7. The neural network pruning method for edge devices according to claim 1, wherein the importance weight of each parameter is given by a Copula-function formula (illegible in this text), whose symbols denote, respectively, the parameter, the gradient value of the parameter, the Kendall correlation coefficient, and the integration variable.
  8. The neural network pruning method for edge devices according to claim 1, wherein the edge device is an intelligent monitoring camera, the data samples are face images, and the initial deep convolutional neural network is deployed on the intelligent monitoring camera to extract and classify features of the face images and output recognition results; the method further comprises using the target deep convolutional neural network to extract features from, and recognize, a face image to be recognized that is acquired by the edge device.
  9. A neural network pruning device for edge devices, comprising: a model training module for inputting data samples from a training set into an initial deep convolutional neural network deployed on an edge device and training the initial deep convolutional neural network; a parameter-gradient ratio calculation module for calculating a first ratio for each parameter based on that parameter's gradient value during training; a parameter-value ratio calculation module for calculating a second ratio for each parameter based on that parameter's value after training is completed; an importance weight calculation module for inputting the first ratio and the second ratio of each parameter into the Copula function to calculate the nonlinear correlation between each parameter and its gradient and outputting the importance weight of each parameter; and a parameter pruning module for pruning parameters based on the importance weights to obtain a target deep convolutional neural network, and deploying the target deep convolutional neural network to the edge device.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the neural network pruning method for edge devices according to any one of claims 1 to 8.
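Read as an algorithm, claims 1–7 describe: compute a local-neighborhood ratio for each parameter's gradient and for its trained value, combine the two ratios through a Copula function parameterized by the Kendall correlation, and prune the parameters with the lowest resulting importance. The following NumPy sketch shows one pruning pass under stated assumptions: the Clayton copula family, the treatment of the two ratios as pseudo-uniform inputs, and all function names are illustrative choices, since the claims do not fix the copula's exact form.

```python
import numpy as np

def local_ratio(values, r=1):
    """First/second ratio (claims 3 and 5): each entry divided by the sum of
    absolute values in its (2r+1) x (2r+1) local neighborhood."""
    v = np.abs(values)
    rows, cols = v.shape
    out = np.zeros_like(v, dtype=float)
    for i in range(rows):
        for j in range(cols):
            i0, i1 = max(0, i - r), min(rows, i + r + 1)
            j0, j1 = max(0, j - r), min(cols, j + r + 1)
            denom = v[i0:i1, j0:j1].sum()
            out[i, j] = v[i, j] / denom if denom > 0 else 0.0
    return out

def kendall_tau(x, y):
    """Plain O(n^2) Kendall rank correlation between two flattened arrays."""
    x, y = np.ravel(x), np.ravel(y)
    n = len(x)
    concordant = discordant = 0
    for a in range(n):
        for b in range(a + 1, n):
            s = np.sign(x[a] - x[b]) * np.sign(y[a] - y[b])
            concordant += s > 0
            discordant += s < 0
    pairs = n * (n - 1) / 2
    return (concordant - discordant) / pairs if pairs else 0.0

def importance_weights(grad_ratio, weight_ratio):
    """Combine the two ratios through a Clayton copula whose parameter comes
    from Kendall's tau (theta = 2*tau / (1 - tau)). The Clayton family is an
    assumption; the patent only states that a Copula function and the Kendall
    coefficient are used."""
    tau = np.clip(kendall_tau(grad_ratio, weight_ratio), -0.99, 0.99)
    theta = 2 * tau / (1 - tau)
    u = np.clip(grad_ratio, 1e-9, 1.0)
    v = np.clip(weight_ratio, 1e-9, 1.0)
    if abs(theta) < 1e-9:  # independence copula C(u, v) = u * v
        return u * v
    return np.maximum(u ** -theta + v ** -theta - 1.0, 1e-12) ** (-1.0 / theta)

def prune(weights, grads, ratio=0.5, r=1):
    """Zero out the fraction `ratio` of parameters with the smallest copula
    importance (claims 1 and 2, single pass instead of the iterative loop)."""
    imp = importance_weights(local_ratio(grads, r), local_ratio(weights, r))
    k = int(ratio * weights.size)
    thresh = np.sort(imp.ravel())[k - 1] if k > 0 else -np.inf
    mask = imp > thresh          # parameters to keep
    return weights * mask, mask
```

In the iterative scheme of claim 2 the `prune` step would instead remove only the N lowest-importance parameters, retrain, and repeat until the cumulative pruned fraction reaches the preset pruning ratio.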

Description

Neural network pruning method and device for edge devices, and storage medium

Technical Field

The invention relates to the technical fields of edge computing and deep learning, and in particular to a neural network pruning method and device for edge devices and a computer-readable storage medium.

Background

With the rapid development of edge computing and deep learning technologies, more and more intelligent applications rely on deploying efficient neural network models on edge devices (e.g., smartphones, Internet of Things devices, and embedded systems). Edge computing moves data processing and computing tasks from conventional centralized data centers to edge devices closer to the data source, reducing latency and bandwidth consumption. Deep learning, an important branch of artificial intelligence, handles complex tasks such as image recognition, speech recognition, and text classification by simulating the structure of human brain neurons.
However, edge devices generally face limited computing power and storage resources; they cannot provide the computing power and mass storage of a data center. For example, an intelligent monitoring camera must recognize faces or objects in the images it captures so that people or objects can be tracked and managed based on the recognition results, which is crucial in public scenes that require monitoring. Because the camera's computing power and resources are limited, the neural network model used for image recognition must be compressed until it can run on such a low-power device, and the compression cannot come at the expense of recognition accuracy, since the monitoring system must remain highly accurate to recognize and track objects in complex environments. How to deploy efficient neural networks on edge devices while maintaining model performance and reducing computation and storage requirements has therefore become a hot research problem. Against this background, neural network pruning has emerged. Pruning is a model compression method that reduces a model's scale and computational cost mainly by removing redundant or unimportant connections in the network; it lowers the network's computational burden and storage requirements and thereby improves inference efficiency.
Existing neural network pruning methods generally prune according to parameter weights, assuming a linear relationship between a parameter's weight and the model's output accuracy, and directly remove the parameters with smaller weight values to reduce the model's scale. However, such methods do not accurately identify the parameters with the least influence on the output accuracy, which often leads to over-pruning or under-pruning, so the pruned model's output accuracy remains low and cannot meet the accuracy requirements of various data processing applications.

Disclosure of Invention

The technical problem to be solved by the invention is therefore that prior-art pruning methods cannot accurately identify the parameters with the least influence on the model's output accuracy, causing over-pruning or under-pruning, low model output accuracy after pruning, and low data processing accuracy in various data processing applications.
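The magnitude-based baseline criticized in the background section can be sketched in a few lines; `magnitude_prune` and its threshold rule are illustrative, not taken from any cited prior-art document.

```python
import numpy as np

def magnitude_prune(weights, ratio=0.5):
    """Conventional magnitude pruning: zero out the fraction `ratio`
    of weights with the smallest absolute value."""
    flat = np.abs(weights).ravel()
    k = int(ratio * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    thresh = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= thresh, 0.0, weights)
```

For example, pruning half of `[[0.1, -2.0], [0.05, 3.0]]` zeroes the two small entries and keeps -2.0 and 3.0 untouched; the criterion never consults the training gradients, which is exactly the limitation the invention targets.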
To solve the above technical problems, the invention provides a neural network pruning method for edge devices, comprising the following steps: S10, inputting data samples from a training set into an initial deep convolutional neural network deployed on an edge device, and training the initial deep convolutional neural network; S20, calculating a first ratio for each parameter based on that parameter's gradient value in the initial deep convolutional neural network during training; S30, calculating a second ratio for each parameter based on that parameter's value in the initial deep convolutional neural network after training is completed; S40, inputting the first ratio and the second ratio of each parameter into a Copula function to calculate no