Fast CNN Pruning via Redundancy-Aware Training

Xiao Dong, Lei Liu, Guangli Li, Peng Zhao, Xiaobing Feng
DOI: https://doi.org/10.1007/978-3-030-01418-6_1
2018-01-01
Abstract: Heavy storage and computational overheads have become a hindrance to the deployment of modern Convolutional Neural Networks (CNNs). To overcome this drawback, many approaches have been proposed to exploit redundancy within CNNs. However, most of them work as post-training processes: they start from pre-trained dense models and apply compression followed by extra fine-tuning, which makes the overall process time-consuming. In this paper, we introduce redundancy-aware training, an approach that learns sparse CNNs from scratch with no need for any post-training compression procedure. In addition to minimizing training loss, redundancy-aware training prunes unimportant weights during the training phase to obtain sparse structures. To ensure stability, a stage-wise pruning procedure is adopted, which is based on carefully designed model partition strategies. Experimental results show that redundancy-aware training can compress LeNet-5, ResNet-56 and AlexNet by factors of 43.8x, 7.9x and 6.4x, respectively. Compared to state-of-the-art approaches, our method achieves similar or higher sparsity while consuming significantly less time, e.g., it is 2.3x-18x more time-efficient.
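The idea of pruning one partition of the model at a time can be illustrated with a minimal sketch. The abstract does not state the importance criterion or the partition strategy, so the code below assumes weight magnitude as the importance score and one layer per stage; both are hypothetical stand-ins, as are all function and variable names. The gradient updates that would run between stages are omitted.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask dropping the `sparsity` fraction of smallest-magnitude weights.

    Assumption: magnitude is used as the importance score; the paper's
    actual criterion is not given in the abstract.
    """
    k = int(len(weights) * sparsity)  # number of weights to prune
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:k]:
        mask[i] = 0
    return mask


def prune_stage(layers, masks, stage, sparsity):
    """Prune only the layers named in `stage`; all other layers stay untouched."""
    for name in stage:
        masks[name] = magnitude_mask(layers[name], sparsity)
        layers[name] = [w * m for w, m in zip(layers[name], masks[name])]


def compression_ratio(masks):
    """Total number of weights divided by the number of weights kept."""
    total = sum(len(m) for m in masks.values())
    kept = sum(sum(m) for m in masks.values())
    return total / kept


# Toy model: two "layers" stored as flat weight lists (hypothetical values).
layers = {
    "conv1": [0.9, -0.05, 0.4, 0.01],
    "fc1": [0.02, -0.7, 0.03, 0.6, -0.01, 0.08, 0.5, -0.04],
}
masks = {name: [1] * len(w) for name, w in layers.items()}

# Assumed partition: one layer per stage, each with its own target sparsity.
stages = [(["conv1"], 0.5), (["fc1"], 0.75)]

for stage, sparsity in stages:
    # In a real training loop, several epochs of masked gradient updates
    # would run here before moving to the next stage.
    prune_stage(layers, masks, stage, sparsity)

print(masks["conv1"])            # [1, 0, 1, 0]
print(masks["fc1"])              # [0, 1, 0, 1, 0, 0, 0, 0]
print(compression_ratio(masks))  # 3.0
```

In this toy setting 4 of 12 weights survive, giving a 3x compression ratio. The same ratio definition (total weights over remaining weights) is one plausible reading of the 43.8x, 7.9x and 6.4x figures reported above.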