Abstract: The success of convolutional neural networks (CNNs) in computer vision applications has been accompanied by a significant increase in computation and memory costs, which prohibits their use in resource-limited environments such as mobile systems and embedded devices. CNN compression has therefore recently emerged as an active research topic. In this paper, we propose a novel filter pruning scheme, termed structured sparsity regularization (SSR), that simultaneously speeds up computation and reduces the memory overhead of CNNs, in a way that is well supported by various off-the-shelf deep learning libraries. Concretely, the proposed scheme incorporates two different structured sparsity regularizers into the original objective function of filter pruning, coordinating the global output with local pruning operations to prune filters adaptively. We further propose an alternative updating with Lagrange multipliers (AULM) scheme to solve the resulting optimization problem efficiently. AULM follows the principle of the alternating direction method of multipliers (ADMM) and alternates between promoting the structured sparsity of CNNs and optimizing the recognition loss, which leads to a very efficient solver (2.5x faster than the most recent work that directly solves the group sparsity-based regularization). Moreover, because structured sparsity is imposed, online inference is extremely memory-light, since the number of filters and the number of output feature maps are reduced simultaneously. The proposed scheme has been applied to a variety of state-of-the-art CNN structures, including LeNet, AlexNet, VGGNet, ResNet, and GoogLeNet, on different data sets. Quantitative results demonstrate that the proposed scheme achieves superior performance over state-of-the-art methods. We further demonstrate the proposed compression scheme on transfer learning tasks, including domain adaptation and object detection, where it also shows performance gains over state-of-the-art filter pruning methods.
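To make the idea of promoting structured sparsity at the filter level concrete, the following is a minimal sketch of group soft-thresholding, the standard proximal step for a group-lasso penalty with one group per output filter. It is an illustration under stated assumptions, not the SSR regularizers or the AULM solver themselves: the function names `group_soft_threshold` and `surviving_filters`, the threshold `tau`, and the weight layout `(num_filters, in_channels, kh, kw)` are all hypothetical choices made for this example.

```python
import numpy as np


def group_soft_threshold(weights, tau):
    """Shrink each output filter toward zero as a whole group.

    Assumed layout (hypothetical): weights has shape
    (num_filters, in_channels, kh, kw). Filters whose L2 norm is
    at most tau become exactly zero; the rest are scaled down.
    """
    flat = weights.reshape(weights.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1)
    # Scale factor max(0, 1 - tau / ||W_f||); the small floor avoids
    # division by zero for filters that are already all-zero.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return weights * scale.reshape(-1, 1, 1, 1)


def surviving_filters(weights):
    """Return indices of filters that still have a nonzero norm."""
    flat = weights.reshape(weights.shape[0], -1)
    return np.flatnonzero(np.linalg.norm(flat, axis=1) > 0)


if __name__ == "__main__":
    w = np.zeros((3, 1, 2, 2))
    w[0] = 1.0   # filter norm 2.0, kept and shrunk
    w[1] = 0.1   # filter norm 0.2, below tau, zeroed
    pruned = group_soft_threshold(w, tau=0.5)
    print(surviving_filters(pruned))
```

Because whole filters are zeroed rather than individual weights, the zeroed filters and their output feature maps can be dropped from the layer, which is why this kind of sparsity maps onto ordinary dense convolutions in standard libraries.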