Abstract: Structured pruning is a widely used approach to compressing convolutional neural networks (CNNs), and setting the pruning rate is one of its fundamental problems. Most existing methods either introduce many additional learnable parameters to assign different pruning rates to different layers, or cannot control the compression rate explicitly. Moreover, because an overly narrow network blocks the information flow needed for training, automatic pruning-rate setting cannot explore a high pruning rate for a specific layer. To overcome these limitations, we propose a novel framework named Layer Adaptive Progressive Pruning (LAPP), which gradually compresses the network during the first few epochs of training from scratch. In particular, LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and FLOPs constraints for the network. Guided by both the task loss and the FLOPs constraints, the learnable thresholds are updated dynamically and gradually to accommodate changes in the importance scores during training. The pruning strategy can therefore prune the network gradually and automatically determine an appropriate pruning rate for each layer. In addition, to maintain the expressive power of each pruned layer, we introduce a lightweight bypass for each convolutional layer to be pruned before training starts, which adds only a small overhead. Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures. For example, on CIFAR-10, our method compresses ResNet-20 to 40.3% without an accuracy drop. On ImageNet, it reduces the FLOPs of ResNet-18 by 55.6% while increasing top-1 accuracy by 0.21% and top-5 accuracy by 0.40%.
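The interplay between per-layer thresholds and a FLOPs constraint can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the importance score, the mask relaxation, or the form of the constraint, so the sigmoid soft mask, the `temperature` parameter, the squared-hinge penalty, and all function names below are assumptions chosen for illustration only.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_mask(scores, threshold, temperature=0.1):
    """Assumed relaxation of the hard rule `keep channel if score > threshold`."""
    return [sigmoid((s - threshold) / temperature) for s in scores]


def conv_flops(c_in, c_out, kernel, out_hw):
    """Multiply-accumulate count of one kernel x kernel convolution."""
    return c_in * c_out * kernel * kernel * out_hw * out_hw


def dense_flops(layers, in_channels=3):
    """FLOPs of the unpruned chain of convolutional layers."""
    total, c_in = 0.0, float(in_channels)
    for layer in layers:
        c_out = float(len(layer["scores"]))
        total += conv_flops(c_in, c_out, layer["kernel"], layer["out_hw"])
        c_in = c_out
    return total


def expected_flops(layers, thresholds, in_channels=3, temperature=0.1):
    """FLOPs when each layer keeps sum(mask) output channels in expectation."""
    total, c_in = 0.0, float(in_channels)
    for layer, t in zip(layers, thresholds):
        kept = sum(soft_mask(layer["scores"], t, temperature))
        total += conv_flops(c_in, kept, layer["kernel"], layer["out_hw"])
        c_in = kept  # pruned output channels shrink the next layer's input
    return total


def flops_penalty(layers, thresholds, target_ratio, temperature=0.1):
    """Assumed squared-hinge penalty: zero once the FLOPs ratio meets the target."""
    ratio = expected_flops(layers, thresholds, temperature=temperature) / dense_flops(layers)
    return max(0.0, ratio - target_ratio) ** 2


# Hypothetical two-layer network with made-up importance scores.
layers = [
    {"scores": [0.9, 0.8, 0.2, 0.1], "kernel": 3, "out_hw": 32},
    {"scores": [0.7, 0.6, 0.5, 0.05], "kernel": 3, "out_hw": 16},
]
penalty = flops_penalty(layers, thresholds=[0.5, 0.3], target_ratio=0.4)
```

In a training loop, a penalty of this kind would be added to the task loss so that gradients raise the thresholds of layers whose channels contribute little to accuracy, which is one way the per-layer pruning rates could emerge without being set by hand.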

Structured Feature Sparsity Training for Convolutional Neural Network Compression

Structured Pruning for Efficient Convolutional Neural Networks Via Incremental Regularization

Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing

Structured Deep Neural Network Pruning by Varying Regularization Parameters

Structured Probabilistic Pruning for Convolutional Neural Network Acceleration

SUBP: Soft Uniform Block Pruning for 1×N Sparse CNNs Multithreading Acceleration

Adversarial Structured Neural Network Pruning

Adaptive Structured Sparse Network for Efficient CNNs with Feature Regularization

Efficient Network Compression Through Smooth-Lasso Constraint

FSCNN: A Fast Sparse Convolution Neural Network Inference System

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

Pushing the Efficiency Limit Using Structured Sparse Convolutions

Prune the Convolutional Neural Networks with Sparse Shrink

Structured Pruning is All You Need for Pruning CNNs at Initialization

DTS: Dynamic Training Slimming with Feature Sparsity for Efficient Convolutional Neural Network

Structured Term Pruning for Computational Efficient Neural Networks Inference

Learning Structured Sparsity in Deep Neural Networks

LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Where to Prune: Using LSTM to Guide Data-Dependent Soft Pruning

SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training