CN-121982410-A - Convolutional neural network self-adaptive structured pruning method, storage medium and device for end-side device image classification
Abstract
The invention discloses a convolutional neural network adaptive structured pruning method, a storage medium, and a device for end-side device image classification, belonging to the technical field of image processing. The method mainly comprises: constructing an index-oriented sparse index based on sparse linear approximation theory, and iteratively executing multiple rounds of unstructured pruning and recovery training on a pre-trained model; after the iterations finish, scanning the current image classification model for redundant parameters and invalid convolution kernels, physically deleting the invalid convolution kernels from memory, and reconstructing the computation graph of the image classification model to complete structured pruning and network reconstruction. By constructing the index-oriented sparse index, the invention automatically evaluates the importance of each convolution layer in image feature extraction and dynamically calculates the number of weight parameters to retain in each layer; an iterative pruning-recovery training mechanism gradually achieves unstructured sparsification, so that image classification accuracy is maintained at high compression rates, fully adaptive pruning is realized, and the method suits resource-constrained scenarios.
Inventors
- YANG GUOWU
- FANG TIANSHENG
- HUANG XIN
- XIE ZHEN
- Huang Zongmo
Assignees
- 电子科技大学 (University of Electronic Science and Technology of China)
Dates
- Publication Date: 2026-05-05
- Application Date: 2026-01-28
Claims (6)
- 1. The convolutional neural network adaptive structured pruning method for end-side device image classification is characterized by comprising the following steps: S1, loading an original image classification model and training it to obtain a pre-trained model, wherein the pre-trained model comprises a plurality of convolution layers for extracting image features; S2, constructing an index-oriented sparse index based on sparse linear approximation theory, and iteratively executing multiple rounds of unstructured pruning and recovery training on the pre-trained model, wherein in each iteration round the unstructured pruning comprises: calculating the number of weight parameters to retain in each convolution layer according to the sparse index; setting a retention threshold, setting weight parameters whose absolute value is smaller than the retention threshold to zero, and thereby sparsifying the pre-trained model; and the recovery training comprises: performing fine-tuning training on the sparsified pre-trained model, and updating the remaining non-zero weight parameters through the back-propagation algorithm to recover the model's ability to express image features; S3, after the iterations finish, scanning the current image classification model for redundant parameters and invalid convolution kernels, physically deleting the invalid convolution kernels from memory, and reconstructing the computation graph of the image classification model to complete structured pruning and network reconstruction; S4, outputting a lightweight image classification model, and deploying it on end-side devices to execute real-time image acquisition and classification tasks.
- 2. The convolutional neural network adaptive structured pruning method for end-side device image classification of claim 1, wherein a masking mechanism is employed to record the zeroed weight parameters during the iterative process, and gradient computation and update operations for the corresponding weight parameters are blocked during the recovery training.
- 3. The convolutional neural network adaptive structured pruning method for end-side device image classification of claim 1, wherein the sparse index is calculated according to a formula in which the sparse index of the t-th iteration round is computed from the weight vector formed by the weight parameters remaining after the previous iteration, and q is an adjustable hyperparameter controlling the pruning intensity.
- 4. A convolutional neural network adaptive structured pruning method for end-side device image classification as defined in claim 3, wherein the reserved number of weight parameters for each convolutional layer is calculated by an equation in which rounding down (floor) is applied to yield the integer reserved number of weight parameters.
- 5. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements a convolutional neural network adaptive structured pruning method for end-side device image classification according to any one of claims 1-4.
- 6. An electronic device comprising a memory and a processor, the memory having stored thereon computer instructions executable on the processor, wherein the processor, when executing the computer instructions, performs an end-side device image classification oriented convolutional neural network adaptive structured pruning method as defined in any one of claims 1-4.
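The iterative flow of steps S1–S4 in claim 1 can be sketched in plain NumPy. The sketch below is illustrative only: the per-round retention schedule is a fixed ratio standing in for the sparse-index calculation the claims define, the function names and shapes are assumptions, and recovery training is elided.

```python
import numpy as np

def magnitude_prune(weights, keep_count):
    """Zero all but the keep_count largest-magnitude entries (unstructured pruning)."""
    flat = np.abs(weights).ravel()
    if keep_count >= flat.size:
        return weights.copy()
    # Retention threshold: the keep_count-th largest magnitude.
    threshold = np.partition(flat, -keep_count)[-keep_count]
    pruned = weights.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned

def iterative_prune(layers, rounds=3, keep_ratio_per_round=0.7):
    """Several pruning rounds; the recovery (fine-tuning) step is elided."""
    for _ in range(rounds):
        for name, w in layers.items():
            keep = max(1, int(np.count_nonzero(w) * keep_ratio_per_round))
            layers[name] = magnitude_prune(w, keep)
        # ... recovery training of the remaining non-zero weights would run here
    return layers
```

After the rounds complete, filters whose weights are entirely zero correspond to the "invalid convolution kernels" that step S3 physically removes.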
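The masking mechanism of claim 2 amounts to recording a binary mask after each pruning pass and multiplying gradients by it, so pruned positions receive no updates during recovery training. A minimal NumPy sketch (the function names and the plain SGD update rule are assumptions, not the patent's implementation):

```python
import numpy as np

def make_mask(pruned_weights):
    """1.0 where a weight survived pruning, 0.0 where it was zeroed."""
    return (pruned_weights != 0.0).astype(pruned_weights.dtype)

def masked_sgd_step(weights, grads, mask, lr=0.01):
    """Plain SGD, but the mask blocks gradient updates to pruned positions,
    so zeroed weights stay zero throughout recovery training."""
    return weights - lr * grads * mask
```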
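The formulas of claims 3 and 4 are not reproduced in the text, so the sketch below substitutes a common sparsity measure from sparse approximation theory, the numerical sparsity ‖w‖₁²/‖w‖₂², with the hyperparameter q placed as a simple scale factor. This is purely illustrative of the index-then-floor pattern the claims describe, not the patent's actual formula.

```python
import numpy as np

def sparse_index(w, q=1.0):
    """Illustrative sparsity index: numerical sparsity ||w||_1^2 / ||w||_2^2,
    scaled by a pruning-intensity hyperparameter q (the placement of q is an
    assumption; the patent's exact formula is not reproduced in the text)."""
    w = w[w != 0.0]
    if w.size == 0:
        return 0.0
    l1 = np.sum(np.abs(w))
    return q * (l1 * l1) / np.sum(w * w)

def retained_count(w, q=1.0):
    """Claim 4's pattern: round the index down (floor) to get an integer
    reserved number of weight parameters, keeping at least one."""
    return max(1, int(np.floor(sparse_index(w, q))))
```

The numerical sparsity lies between 1 and the number of non-zero entries, so its floor is always a valid retention count: a layer of equal-magnitude weights retains everything, while a layer dominated by a few large weights retains only those.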
Description
Convolutional neural network self-adaptive structured pruning method, storage medium and device for end-side device image classification
Technical Field
The invention relates to the technical field of image processing, and in particular to a convolutional neural network adaptive structured pruning method, storage medium, and device for end-side device image classification.
Background
In recent years, convolutional neural networks (CNNs) have achieved excellent results in fields such as image recognition and natural language processing, and typical deep CNN models (e.g., VGG-16 and ResNet-50) extract complex image features by stacking a large number of convolutional layers. However, these high-performance models have large parameter counts and high computational complexity, resulting in extremely high demands on storage space and computational resources. This severely limits real-time deployment of CNN models on resource-constrained mobile devices (e.g., smartphones), embedded vision systems (e.g., autonomous-driving cameras, drones), or Internet of Things terminals. To address this problem, model compression techniques have been developed, among which pruning is an important means of reducing model redundancy. Pruning techniques can be categorized by granularity into unstructured pruning (unstructured sparsity) and structured pruning (structured sparsity). Structured pruning achieves model compression by removing larger-granularity building blocks of the network, such as convolution kernels, channels, or even whole layers. Unlike unstructured pruning, structured pruning does not destroy the dense structure of the original tensors while removing parameters, so existing deep learning frameworks and general-purpose hardware can be used directly to achieve computational acceleration, without special hardware or library support. Structured pruning is therefore considered one of the key technologies for efficient model deployment.
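The key property of structured pruning described above, that removing whole filters leaves a smaller but still dense tensor which standard frameworks can run directly, can be shown in a few lines. L1-norm filter scoring here is one common criterion mentioned in the background, used for illustration:

```python
import numpy as np

def prune_filters_l1(conv_weight, keep):
    """Structured pruning: keep the `keep` output filters with the largest
    L1 norm and physically drop the rest, so the result is a smaller but
    still dense tensor that standard frameworks can run directly.

    conv_weight has shape (out_channels, in_channels, kH, kW)."""
    norms = np.abs(conv_weight).sum(axis=(1, 2, 3))   # one L1 score per filter
    kept = np.sort(np.argsort(norms)[-keep:])         # surviving filter indices
    return conv_weight[kept], kept
```

The next layer's input channels must then be sliced with the same `kept` indices so the computation graph stays consistent, which is the "network reconstruction" step.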
The implementation of structured pruning generally follows an established procedure whose core is to evaluate the redundancy of structural units in the network and perform the removal operation. Depending on the iterative strategy employed, the prior art can be generalized into two forms. The first is a procedure based on one-time evaluation and pruning (one-shot pruning). This method pursues efficiency: first, importance evaluation is carried out on all structural units to be pruned in the pre-trained model. The evaluation criterion typically depends on an easy-to-compute metric, such as the L1 or L2 norm of a unit's weights, or directly on the magnitude of the weight values. After the evaluation, a pruning threshold is determined according to a preset pruning rate, and all building blocks with importance scores below this threshold are then removed in one go. The second is a procedure based on iterative or loop optimization (iterative pruning). To better maintain model accuracy at high compression rates, many prior techniques employ a "pruning-fine-tuning" loop. In this flow, the model is not pruned heavily at once; instead, only a small portion of the currently least significant building blocks is removed in each iteration cycle. After each small pruning operation, the model must undergo sufficient fine-tuning so that the weights adapt to the new sparse structure and converge. The next round of importance assessment, pruning, and fine-tuning is then repeated on the fine-tuned model. This process continues until a preset total target pruning rate is reached or a specific model performance metric is met.
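The one-shot flow described above, score every unit, derive a threshold from a preset pruning rate, and remove everything below it in one pass, can be sketched as follows (a hypothetical minimal version; the quantile-based threshold is one straightforward way to realize "a threshold determined by a preset pruning rate"):

```python
import numpy as np

def one_shot_prune(scores, prune_rate):
    """One-shot pruning: the threshold is the prune_rate-quantile of the
    importance scores; every unit scoring below it is removed in one pass.
    Returns the indices of the surviving units."""
    threshold = np.quantile(scores, prune_rate)
    return [i for i, s in enumerate(scores) if s >= threshold]
```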
Although structured pruning has great potential for model compression and acceleration, existing structured pruning methods for image classification models still face several problems that urgently need to be solved: 1. The trade-off between accuracy loss and compression rate is difficult. Most existing pruning methods have difficulty maintaining satisfactory model accuracy at high compression rates. Excessive pruning often significantly reduces the model's expressive power and impairs generalization, especially when complex images (e.g., traffic signs, faces, natural scenes) are processed, causing serious loss of detail features. 2. The limitations of redundancy quantization criteria. Existing pruning criteria, such as metrics based on weight L1/L2 norms or activation values, can often only roughly assess the importance of a building block. Such simple metrics have difficulty accurately capturing a unit's true contribution to final classification performance, so important structures may be erroneously removed, harming model performance. 3. Dependency problem between different structure