Abstract: While previous network compression methods have achieved great success, most of them rely on abundant training data, which is often unavailable in practice because of privacy issues, storage constraints, or transmission limitations. A promising way to solve this problem is to perform compression with only a few unlabeled samples. Following this direction, we propose a novel few-shot network compression framework named Few-Shot Slimming (FSS). FSS follows the student/teacher paradigm and consists of two steps: (1) construct the student by inheriting principal feature maps from the teacher; (2) refine the student's feature representation by knowledge distillation with an enhanced mixing data augmentation method called GridMix. Specifically, in the first step, we employ normalized cross correlation to perform principal feature analysis, and then theoretically construct a new indicator to select the most informative feature maps from the teacher for the student. The indicator is based on the variances of the feature maps, which efficiently quantify the information richness of the input feature maps in a feature-agnostic manner. In the second step, we perform knowledge distillation on the student initialized in the first step, using a novel grid-based mixing data augmentation technique that greatly extends the limited sample dataset. In this way, the student is able to refine its feature representation and achieve a better result. Extensive experiments on multiple benchmarks demonstrate the state-of-the-art performance of FSS. For example, using label-free data amounting to 0.2% of the full training set, FSS yields a 60% FLOPs reduction for DenseNet-40 on CIFAR-10 with a loss of only 0.8% in top-1 accuracy, a result on par with that obtained by conventional full-data methods.
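
The two ingredients named in the abstract, a variance-based indicator for choosing feature maps and a grid-based mixing augmentation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rank_channels_by_variance` and `grid_mix`, the grid size, and the rule of picking each grid cell from one of two images at random are all assumptions made for illustration, and the normalized cross correlation analysis is omitted entirely.

```python
import random
from statistics import pvariance


def rank_channels_by_variance(feature_maps):
    """Rank channels from highest to lowest activation variance.

    feature_maps: list of channels, each a flat list of activations.
    Hypothetical stand-in for a variance-based selection indicator.
    """
    variances = [pvariance(channel) for channel in feature_maps]
    return sorted(range(len(feature_maps)),
                  key=lambda i: variances[i], reverse=True)


def grid_mix(img_a, img_b, grid=4, p=0.5, rng=None):
    """Mix two equally sized 2D images cell by cell (illustrative only).

    The image is divided into a grid x grid layout; each cell is copied
    from img_a with probability p, otherwise from img_b.
    Returns the mixed image and the fraction of pixels taken from img_a.
    """
    rng = rng or random.Random()
    height, width = len(img_a), len(img_a[0])
    # Decide once per cell which source image it comes from.
    take_a = [[rng.random() < p for _ in range(grid)] for _ in range(grid)]
    mixed, pixels_from_a = [], 0
    for y in range(height):
        cell_y = y * grid // height  # grid row containing this pixel
        row = []
        for x in range(width):
            cell_x = x * grid // width  # grid column containing this pixel
            if take_a[cell_y][cell_x]:
                row.append(img_a[y][x])
                pixels_from_a += 1
            else:
                row.append(img_b[y][x])
        mixed.append(row)
    return mixed, pixels_from_a / (height * width)


if __name__ == "__main__":
    channels = [[1, 1, 1, 1], [0, 10, 0, 10], [1, 2, 1, 2]]
    print(rank_channels_by_variance(channels))
    ones = [[1] * 8 for _ in range(8)]
    zeros = [[0] * 8 for _ in range(8)]
    mixed, ratio = grid_mix(ones, zeros, grid=4, rng=random.Random(0))
    print(ratio)
```

In a distillation setting, the teacher's output on each mixed image could serve as the training target for the student, which is one way such an augmentation can work without labels; the exact loss used by FSS is not specified in the abstract.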

Neural Network Compression Via Sparse Optimization

Efficient Neural Network Compression Inspired by Compressive Sensing

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction

Compressing Deep Neural Networks With Sparse Matrix Factorization

Convolutional neural networks compression with low rank and sparse tensor decompositions

Towards Efficient Network Compression Via Few-Shot Slimming

Exploring Structural Sparsity in Neural Image Compression

SparseNN: A Performance-Efficient Accelerator for Large-Scale Sparse Neural Networks

Neural Network Compression Framework for Fast Model Inference

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization

On Compressing Deep Models by Low Rank and Sparse Decomposition

Efficient Network Compression Through Smooth-Lasso Constraint

Low-Rank+Sparse Tensor Compression for Neural Networks

Minimally Invasive Surgery for Sparse Neural Networks in Contrastive Manner

An efficient pruning and fine-tuning method for deep spiking neural network

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

End-to-End Neural Network Compression via $\frac{\ell_1}{\ell_2}$ Regularized Latency Surrogates

LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time

On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee

Self-Compressing Neural Networks