Abstract: Sparse neural networks usually contain many zero weights, and the corresponding network connections can potentially be eliminated without degrading network performance. Well-designed sparse neural networks therefore have the potential to significantly reduce FLOPs and computational resources. In this work, we propose a new automatic pruning method, Sparse Connectivity Learning (SCL). Specifically, each weight is re-parameterized as the element-wise product of a trainable weight variable and a binary mask. Network connectivity is thus fully described by the binary mask, which is modulated by a unit step function. We theoretically prove the fundamental principle of using a straight-through estimator (STE) for network pruning: the proxy gradients of the STE should be positive, which ensures that mask variables converge at their minima. After finding that Leaky ReLU, Softplus, and Identity STEs satisfy this principle, we adopt the Identity STE in SCL for discrete mask relaxation. We find that the mask gradients of different features are highly unbalanced; hence, we normalize the mask gradients of each feature to improve mask variable training. To train sparse masks automatically, we include the total number of network connections as a regularization term in our objective function. Because SCL does not require pruning criteria or layer-wise hyper-parameters defined by designers, the network is explored in a larger hypothesis space to achieve optimized sparse connectivity for the best performance. SCL thereby overcomes the limitations of existing automatic pruning methods. Experimental results demonstrate that SCL can automatically learn and select important network connections for various baseline network structures. Deep learning models trained by SCL outperform state-of-the-art human-designed and automatic pruning methods in sparsity, accuracy, and FLOPs reduction.
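To make the re-parameterization concrete, the following is a minimal sketch in plain Python of a single linear neuron whose effective weights are w_i * step(m_i), trained with an Identity STE and a connection-count regularizer. It is an illustration under stated assumptions, not the authors' implementation: the function names, the squared-error loss, the regularization weight `lam`, and the learning rate are all hypothetical, and the per-feature mask-gradient normalization is omitted because the abstract does not specify its form.

```python
def step(m):
    """Unit step: a connection is kept (1.0) when its mask variable is positive."""
    return 1.0 if m > 0 else 0.0


def forward(w, m, x):
    """Output of one linear neuron with effective weights w_i * step(m_i)."""
    return sum(wi * step(mi) * xi for wi, mi, xi in zip(w, m, x))


def loss_and_grads(w, m, x, target, lam):
    """Squared error plus lam * (number of active connections).

    Assumption: the backward pass uses an Identity STE, i.e. the derivative
    of step(m) with respect to m is replaced by the constant 1.
    """
    err = forward(w, m, x) - target
    active = sum(step(mi) for mi in m)
    loss = 0.5 * err ** 2 + lam * active
    # Gradient w.r.t. the weight variables: pruned connections receive none.
    grad_w = [err * step(mi) * xi for mi, xi in zip(m, x)]
    # Gradient w.r.t. the mask variables: task term plus regularizer, both
    # passed straight through the step function (proxy gradient of 1).
    grad_m = [err * wi * xi + lam for wi, xi in zip(w, x)]
    return loss, grad_w, grad_m


def train(w, m, data, lam=0.01, lr=0.05, epochs=200):
    """Plain SGD on both weight and mask variables (hypothetical settings)."""
    w, m = list(w), list(m)
    for _ in range(epochs):
        for x, target in data:
            _, gw, gm = loss_and_grads(w, m, x, target, lam)
            w = [wi - lr * g for wi, g in zip(w, gw)]
            m = [mi - lr * g for mi, g in zip(m, gm)]
    return w, m
```

Because the proxy gradient is a positive constant, the regularization term pushes every mask variable toward the negative side at the same rate, and a connection survives only if its task gradient counteracts that push. Note that in this sketch pruned connections still receive mask gradients, so a masked-out connection can in principle be re-activated later in training.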

Learning the number of nodes in DNNs with activation mask.

Class-Aware Pruning for Efficient Neural Networks

Efficient Structure Slimming for Spiking Neural Networks

Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)

Reshaping deep neural network for fast decoding by node-pruning

Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error

Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures

Learning both Weights and Connections for Efficient Neural Network

Neuron Sparseness Versus Connection Sparseness in Deep Neural Network for Large Vocabulary Speech Recognition

Automatic Sparse Connectivity Learning for Neural Networks

Energy-Efficient Deep Neural Network Optimization Via Pooling-Based Input Masking

Identifying and Pruning Redundant Structures for Deep Neural Networks

Learning Optimized Structure of Neural Networks by Hidden Node Pruning with L1 Regularization

Towards Scalable and Deep Graph Neural Networks via Noise Masking

Compressing Deep Neural Network for Facial Landmarks Detection

Efficient Deep Structure Learning for Resource-Limited IoT Devices

Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking

Neural network relief: a pruning algorithm based on neural activity

Accelerating DCNNs via Cooperative Weight/Activation Compression

An efficient pruning scheme of deep neural networks for Internet of Things applications

A roulette wheel-based pruning method to simplify cumbersome deep neural networks