Abstract: Deep convolutional neural networks (CNNs) have achieved tremendous success but tend to incur high computation costs, mainly because of heavy over-parameterization. This makes them difficult to deploy directly on low-end edge devices with strict power limits and real-time inference requirements, where application demand keeps growing. Recently, much research attention has been devoted to compressing networks via pruning to address this issue. Most existing methods rely on hand-designed pruning rules, which suffer from several limitations. First, manually designed rules apply only to limited application scenarios and hardly generalize to a broader scope. Second, these rules are typically designed from human experience through trial and error and are thus highly subjective. Third, channels in different layers of a network may have diverse distributions, so the same pruning rule is not appropriate for every layer. To address these limitations, we propose a novel channel pruning scheme in which task-irrelevant channels are removed in a task-driven manner. Specifically, an adaptively differentiable search module is proposed to automatically find the best pruning rule for each layer of a CNN under sparsity constraints. In addition, we employ knowledge distillation to alleviate excessive performance loss. Once training is finished, a compact network is obtained by removing channels according to the layer-wise pruning rules. We evaluate the proposed method on well-known benchmark datasets, including CIFAR, MNIST, and ImageNet, in comparison with several state-of-the-art pruning methods. Experimental results demonstrate the superiority of our method over the compared ones in terms of both parameter and FLOPs reduction.
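
To make the idea of layer-wise pruning rules concrete, the following is a minimal sketch, not the method proposed in the paper. It assumes two candidate rules (per-channel L1 and L2 weight norms) and mixes them with a softmax over per-layer logits, which is one common way to make a discrete choice differentiable. The function names `channel_scores` and `prune_layer`, the candidate rules, and the `keep_ratio` parameter are all hypothetical and do not come from the abstract.

```python
import numpy as np

def channel_scores(weight, rule_logits, eps=1e-12):
    """Score each output channel of one conv layer.

    weight: array of shape (out_channels, in_channels, kh, kw).
    rule_logits: array of shape (2,), hypothetical per-layer logits that
    weight the two candidate rules (L1 norm, L2 norm).
    """
    flat = weight.reshape(weight.shape[0], -1)
    l1 = np.abs(flat).sum(axis=1)
    l2 = np.sqrt((flat ** 2).sum(axis=1))
    # Normalise each candidate so the two rules are on a comparable scale.
    candidates = np.stack([l1 / (l1.sum() + eps), l2 / (l2.sum() + eps)])
    # Softmax over the rule logits gives a soft, layer-specific rule choice.
    shifted = np.exp(rule_logits - np.max(rule_logits))
    alpha = shifted / shifted.sum()
    return alpha @ candidates

def prune_layer(weight, rule_logits, keep_ratio):
    """Keep the highest-scoring fraction of output channels (assumed policy)."""
    scores = channel_scores(weight, rule_logits)
    n_keep = max(1, int(round(keep_ratio * weight.shape[0])))
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return weight[keep], keep

# Usage: a toy layer with two all-zero channels, which should be removed.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))
w[0] = 0.0
w[5] = 0.0
pruned, kept = prune_layer(w, rule_logits=np.array([0.3, -0.1]), keep_ratio=0.75)
```

In a real training loop the logits would be learned jointly with the network weights under a sparsity penalty, and a distillation loss from the unpruned network could be added; both are omitted here to keep the sketch short.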

Students and teachers learning together: a robust training strategy for neural network pruning

Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing

Loss Constrains Added Squeeze and Excitation Blocks for Pruning Deep Neural Networks

Class-Aware Pruning for Efficient Neural Networks

Structured Pruning for Efficient Convolutional Neural Networks Via Incremental Regularization

SUBP: Soft Uniform Block Pruning for 1×N Sparse CNNs Multithreading Acceleration

A Pruning Method Based on the Dissimilarity of Angle among Channels and Filters

Adversarial Structured Neural Network Pruning

Efficient Network Compression Through Smooth-Lasso Constraint

Transfer Knowledge for High Sparsity in Deep Neural Networks

Sparse optimization guided pruning for neural networks

A Dynamic Pruning Method on Multiple Sparse Structures in Deep Neural Networks

Learning Low Resource Consumption CNN through Pruning and Quantization

Prune the Convolutional Neural Networks with Sparse Shrink

Neural Network Pruning with Residual-Connections and Limited-Data

Adaptive Search-and-Training for Robust and Efficient Network Pruning

Accelerating Convolutional Neural Networks By Group-Wise 2d-Filter Pruning

An Automatically Layer-wise Searching Strategy for Channel Pruning Based on Task-driven Sparsity Optimization

Learning to Prune in Training Via Dynamic Channel Propagation

Where to Prune: Using LSTM to Guide Data-Dependent Soft Pruning

Pruning the Deep Neural Network by Similar Function