Abstract: Channel pruning is a promising method for accelerating and compressing convolutional neural networks. However, current pruning algorithms still leave two problems unsolved: how to assign layer-wise pruning ratios properly, and how to discard the least important channels using a convincing criterion. In this paper, we present a novel channel pruning approach based on information theory and the interpretability of neural networks. Specifically, we regard information entropy as the expected amount of information in a convolutional layer. In addition, if we view a matrix as a system of linear equations, a higher rank means that more solutions exist, which indicates more uncertainty. From the point of view of information theory, the rank can therefore also describe the amount of information. Treating rank and entropy as two information indicators of a convolutional layer, we propose a fusion function that reaches a compromise between them, and we define the fused result as "information concentration". When pre-defining layer-wise pruning ratios, we use the information concentration as a reference instead of heuristic and engineering tuning, which provides a more interpretable solution. Moreover, we leverage Shapley values, a potent tool in the interpretability of neural networks, to evaluate channel contributions and discard the least important channels, compressing the model while maintaining its performance. Extensive experiments demonstrate the effectiveness and promising performance of our method. For example, our method improves accuracy by 0.21% while reducing FLOPs by 45.5% and removing 40.3% of the parameters for ResNet-56 on CIFAR-10. Moreover, our method incurs losses in Top-1/Top-5 accuracy of 0.43%/0.11% while reducing FLOPs by 41.6% and removing 35.0% of the parameters for ResNet-50 on ImageNet.
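
To make the Shapley-based channel scoring concrete, the following is a minimal sketch and not the authors' implementation. It computes exact Shapley values for four channels under a hypothetical value function (`toy_accuracy`, with made-up per-channel scores and one interaction term), then selects the lowest-contribution channels for a given pruning ratio. The names `shapley_values`, `toy_accuracy`, `CHANNEL_SCORE`, and `channels_to_prune` are all assumptions introduced for illustration. Exact enumeration is only feasible for a handful of channels, so a real layer would presumably need an approximation.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small set of players (here: channel indices)."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical per-channel scores; purely illustrative, not from the paper.
CHANNEL_SCORE = {0: 0.40, 1: 0.05, 2: 0.30, 3: 0.10}


def toy_accuracy(kept):
    """Hypothetical 'accuracy' of a layer that keeps only the given channels."""
    base = sum(CHANNEL_SCORE[c] for c in kept)
    bonus = 0.10 if {0, 2} <= kept else 0.0  # channels 0 and 2 cooperate
    return base + bonus


def channels_to_prune(phi, ratio):
    """Return the lowest-contribution channels for a given pruning ratio."""
    n_prune = int(len(phi) * ratio)
    return sorted(sorted(phi, key=phi.get)[:n_prune])


phi = shapley_values(list(CHANNEL_SCORE), toy_accuracy)
pruned = channels_to_prune(phi, ratio=0.5)
```

In this toy setting the interaction bonus is split evenly between channels 0 and 2, and the Shapley values sum to the value of the full channel set, which is the efficiency property that makes them a natural contribution measure.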

ARPruning: An automatic channel pruning based on attention map ranking

Loss Constrains Added Squeeze and Excitation Blocks for Pruning Deep Neural Networks

Structured Pruning for Efficient Convolutional Neural Networks Via Incremental Regularization

Class-Aware Pruning for Efficient Neural Networks

Towards Efficient Filter Pruning Via Adaptive Automatic Structure Search

Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance

An Automatically Layer-wise Searching Strategy for Channel Pruning Based on Task-driven Sparsity Optimization

Channel Pruning via Automatic Structure Search

AACP: Model Compression by Accurate and Automatic Channel Pruning

Adversarial Structured Neural Network Pruning

LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Conditional Automated Channel Pruning for Deep Neural Networks

RL-Pruner: Structured Pruning Using Reinforcement Learning for CNN Compression and Acceleration

Filter Pruning Via Feature Map Clustering

Pruning with Compensation: Efficient Channel Pruning for Deep Convolutional Neural Networks

An Effective Information Theoretic Framework for Channel Pruning

Three-Stage Global Channel Pruning for Resources-Limited Platform

ABCP: Automatic Block-wise and Channel-wise Network Pruning via Joint Search

Single-shot Channel Pruning Based on Alternating Direction Method of Multipliers