Abstract: To reduce computational overhead while maintaining model performance, model pruning techniques have been proposed. Among these, structured pruning, which removes entire convolutional channels or layers, significantly enhances computational efficiency and is compatible with hardware acceleration. However, existing pruning methods that rely solely on image features or gradients often result in the retention of redundant channels, negatively impacting inference efficiency. To address this issue, this paper introduces a novel pruning method called Feature-Gradient Pruning (FGP). This approach integrates both feature-based and gradient-based information to more effectively evaluate the importance of channels across various target classes, enabling a more accurate identification of channels that are critical to model performance. Experimental results demonstrate that the proposed method improves both model compactness and practicality while maintaining stable performance. Experiments conducted across multiple tasks and datasets show that FGP significantly reduces computational costs and minimizes accuracy loss compared to existing methods, highlighting its effectiveness in optimizing pruning outcomes. The source code is available at: https://github.com/FGP-code/FGP.
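
The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how feature and gradient information might be combined into a per-channel score for one target class. It assumes a Taylor-style criterion (the absolute mean of activation times gradient); the function name `channel_support` and the toy inputs are hypothetical and are not taken from the FGP source code.

```python
from typing import List


def channel_support(features: List[List[float]],
                    grads: List[List[float]]) -> List[float]:
    """Score each channel for one class from its activations and gradients.

    features[c] and grads[c] are the flattened spatial maps of channel c.
    ASSUMPTION: score = |mean(activation * gradient)|; the paper's actual
    criterion may differ.
    """
    if len(features) != len(grads):
        raise ValueError("features and grads must have the same channel count")
    scores = []
    for feat, grad in zip(features, grads):
        if len(feat) != len(grad) or not feat:
            raise ValueError("each channel needs matching, non-empty maps")
        total = sum(a * g for a, g in zip(feat, grad))
        scores.append(abs(total) / len(feat))
    return scores


# Toy example with three channels and two spatial positions each.
features = [[1.0, 2.0], [0.0, 0.0], [1.0, 1.0]]
grads = [[0.5, 0.5], [3.0, 3.0], [-1.0, 1.0]]
print(channel_support(features, grads))  # [0.75, 0.0, 0.0]
```

In this toy input, channel 1 has large gradients but no activation, so a gradient-only criterion would rank it highly, while the combined score treats it as unimportant.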

What problem does this paper attempt to address?

The paper addresses how to prune convolutional neural networks (CNNs) so that computational overhead is reduced while model performance is maintained. Existing pruning methods fall mainly into two types, feature-based pruning and gradient-based pruning, and each has characteristic weaknesses:

1. **Feature-based pruning**: Such methods usually evaluate the importance of channels by measuring the magnitude of convolutional layer weights (such as the L1 norm) or the γ value in the batch normalization layer. Although they are simple and computationally efficient, they may fail to capture complex relationships between channels and tend to retain redundant channels, which hurts inference efficiency.
2. **Gradient-based pruning**: Such methods evaluate the contribution of channels to task optimization by analyzing gradient information (such as RFC or DMCP). They reflect the actual impact of channels during training, but often lack an overall view of channel characteristics and tend to ignore the feature representation of channels. As a result, they may not preserve a good feature distribution, which affects model stability.

To overcome these problems, the paper proposes a new pruning method, Feature-Gradient Pruning (FGP). FGP combines feature information and gradient information to evaluate the importance of channels for different target categories more effectively, and so identifies the channels that are crucial for model performance more accurately. In this way, FGP reduces computational costs while minimizing accuracy loss and improving the compactness and practicality of the model.

### Main contributions

1. **Combination of feature and gradient information**: FGP combines feature and gradient information to evaluate the importance of channels, addressing the deficiencies of existing pruning methods. By integrating these two types of information, FGP can more accurately identify channels that are crucial for model performance, thereby improving model efficiency.
2. **Fine-grained channel selection**: FGP refines the channel selection process, making pruning more precise. It retains channels that are important for all categories while removing redundant channels that are only useful for specific categories. This finer-grained pruning enhances the compactness and practicality of the model.
3. **Top-k strategy based on support value concentration**: FGP uses a Top-k strategy based on the concentration of support values for different categories to select the channels that are most important for all categories. The pruning ratio is not fixed but is adjusted dynamically according to the support values of the channels.

### Experimental results

The experimental results show that FGP significantly reduces computational costs and minimizes accuracy loss on multiple tasks and datasets. Specific results include:

- **Image classification tasks**: Experiments were carried out on the CIFAR-10 and CIFAR-100 datasets using the VGG-16 and ResNet-50 models. FGP maintained high classification accuracy while reducing the number of FLOPs and parameters.
- **Image segmentation tasks**: Experiments were carried out on the CamVid and Cityscapes datasets using SegNet and ResNet-50 as the backbone networks. FGP maintained a high mIoU while reducing the number of FLOPs and parameters.

### Parameter analysis

- **Top-k value study**: Experiments on the CIFAR-10 dataset using the VGGNet-16 architecture found that the model achieves the best balance between accuracy and inference speed when the Top-k value is between 0.35 and 0.4.
- **Number of categories study**: By controlling the number of categories in the dataset (from 5 to 100), it was found that the accuracy gap between the pruned model and the baseline model gradually widens as the number of categories increases. This indicates that in tasks with a small number of categories, FGP retains channels with high support values more effectively, achieving model compression and inference acceleration while keeping accuracy close to the baseline.

In summary, by proposing FGP, the paper addresses the difficulty existing pruning methods have in reducing computational overhead while maintaining model performance, and offers a new approach to efficient pruning of deep learning models.
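
The Top-k selection summarized above can be illustrated with a short sketch. It is one plausible reading rather than the authors' implementation: it assumes that each category keeps the top `k_ratio` fraction of channels by support value and that only channels appearing in every category's top set are retained, so the effective pruning ratio depends on the support values. The names `top_k_channels`, `channels_to_keep`, and `k_ratio` are hypothetical.

```python
import math
from typing import List, Optional, Set


def top_k_channels(support: List[float], k_ratio: float) -> Set[int]:
    """Indices of the top k_ratio fraction of channels for one category."""
    n_keep = max(1, math.ceil(k_ratio * len(support)))
    ranked = sorted(range(len(support)), key=lambda c: support[c], reverse=True)
    return set(ranked[:n_keep])


def channels_to_keep(support_per_class: List[List[float]],
                     k_ratio: float) -> List[int]:
    """Keep channels ranked in the top k_ratio for every category.

    ASSUMPTION: "important for all categories" is modelled as the
    intersection of the per-category Top-k sets.
    """
    keep: Optional[Set[int]] = None
    for support in support_per_class:
        top = top_k_channels(support, k_ratio)
        keep = top if keep is None else keep & top
    return sorted(keep or [])


# Two categories, four channels. Channel 0 matters to both categories,
# channels 1 and 2 each matter to only one, and channel 3 matters to neither.
support_per_class = [
    [0.9, 0.8, 0.1, 0.05],
    [0.7, 0.1, 0.9, 0.05],
]
print(channels_to_keep(support_per_class, k_ratio=0.5))  # [0]
```

With this reading, raising `k_ratio` keeps more channels and lowering it prunes more aggressively, which matches the accuracy versus speed trade-off described in the Top-k value study.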

FGP: Feature-Gradient-Prune for Efficient Convolutional Layer Pruning

Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing

Adding Before Pruning: Sparse Filter Fusion for Deep Convolutional Neural Networks Via Auxiliary Attention

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

Channel Pruning Based on Mean Gradient for Accelerating Convolutional Neural Networks

Pruning Deep Convolutional Neural Networks via Gradient Support Pursuit

GAP: A Group-based Automatic Pruning Algorithm Via Convolution Kernel Fusion

Generalized Gradient Flow Based Saliency for Pruning Deep Convolutional Neural Networks

Optimization Based Layer-Wise Pruning Threshold Method for Accelerating Convolutional Neural Networks

Layer Pruning via Fusible Residual Convolutional Block for Deep Neural Networks

Where to Prune: Using LSTM to Guide Data-Dependent Soft Pruning

Efficient Structured Pruning Based on Deep Feature Stabilization

FGGP: Fixed-Rate Gradient-First Gradual Pruning

Global balanced iterative pruning for efficient convolutional neural networks

LOCP: Latency-optimized channel pruning for CNN inference acceleration on GPUs

NFP: A No Fine-tuning Pruning Approach for Convolutional Neural Network Compression

Convolution Network Pruning Based on the Evaluation of the Importance of Characteristic Attributions

Latency-aware Automatic CNN Channel Pruning with GPU Runtime Analysis

CCPrune: Collaborative Channel Pruning for Learning Compact Convolutional Networks

Accelerating Convolutional Networks via Global & Dynamic Filter Pruning

Filter Pruning Via Geometric Median for Deep Convolutional Neural Networks Acceleration