Abstract: Previous network compression methods have achieved great success, but most rely on abundant training data, which is often unavailable in practice because of privacy issues, storage constraints, or transmission limitations. A promising solution is to perform compression with only a few unlabeled samples. Following this direction, we propose a novel few-shot network compression framework named Few-Shot Slimming (FSS). FSS follows the student/teacher paradigm and consists of two steps: (1) construct the student by inheriting principal feature maps from the teacher; (2) refine the student's feature representation by knowledge distillation with an enhanced mixing data augmentation method called GridMix. Specifically, in the first step, we employ normalized cross-correlation to perform principal feature analysis and then theoretically construct a new indicator to select the most informative feature maps from the teacher for the student. The indicator is based on the variances of the feature maps, which can efficiently quantify the information richness of the input feature maps in a feature-agnostic manner. In the second step, we perform knowledge distillation on the student initialized in the first step, using a novel grid-based mixing data augmentation technique that greatly extends the limited sample set. In this way, the student refines its feature representation and achieves a better result. Extensive experiments on multiple benchmarks demonstrate the state-of-the-art performance of FSS. For example, using label-free data amounting to 0.2% of the full training set, FSS yields a 60% FLOPs reduction for DenseNet-40 on CIFAR-10 with a loss of only 0.8% in top-1 accuracy, a result on par with that of conventional full-data methods.
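
The two steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the FSS implementation: the abstract does not give the exact form of the variance-based indicator or of GridMix, so the helpers below (`rank_channels_by_variance` and `grid_mix`, both hypothetical names) simply rank channels by raw activation variance and mix two images cell by cell on a regular grid. The grid size, the 0.5 cell probability, and the array layouts are assumptions made for illustration.

```python
import numpy as np


def rank_channels_by_variance(feature_maps, keep):
    """Return the indices of the `keep` highest-variance channels.

    feature_maps: array of shape (N, C, H, W) collected from a few samples.
    Assumption: plain per-channel variance stands in for the paper's indicator.
    """
    num_channels = feature_maps.shape[1]
    # Flatten every channel over samples and spatial positions.
    per_channel = feature_maps.transpose(1, 0, 2, 3).reshape(num_channels, -1)
    variances = per_channel.var(axis=1)
    top = np.argsort(variances)[::-1][:keep]
    return np.sort(top)


def grid_mix(img_a, img_b, grid=4, rng=None):
    """Mix two (H, W, C) images cell by cell on a grid x grid layout.

    Each cell is taken from img_a with probability 0.5, otherwise from img_b.
    Returns the mixed image and the fraction of pixels taken from img_a.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = img_a.shape[:2]
    take_a = rng.random((grid, grid)) < 0.5
    # Map every pixel row and column to the grid cell that contains it.
    rows = np.arange(height) * grid // height
    cols = np.arange(width) * grid // width
    mask = take_a[rows][:, cols]  # shape (H, W), True where img_a is used
    mixed = np.where(mask[..., None], img_a, img_b)
    return mixed, float(mask.mean())
```

In a distillation setting, the mixed image would be passed through both teacher and student, with the teacher's output serving as the target, so no labels are required; the returned mixing fraction is informational only.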

Fast Conditional Network Compression Using Bayesian HyperNetworks

Efficient Model Compression for Bayesian Neural Networks

Neural Network Compression Framework for Fast Model Inference

On Compression Principle and Bayesian Optimization for Neural Networks

Self-Compressing Neural Networks

Hyper-Compression: Model Compression via Hyperfunction

Multi-Context Dual Hyper-Prior Neural Image Compression

Structured Bayesian Compression for Deep Neural Networks Based on The Turbo-VBI Approach

Compression with Bayesian Implicit Neural Representations

Neural Network Compression Via Sparse Optimization

Towards Efficient Network Compression Via Few-Shot Slimming

Efficient Network Compression Through Smooth-Lasso Constraint

Efficient Neural Network Compression Inspired by Compressive Sensing

Efficient Bayesian CNN Model Compression using Bayes by Backprop and L1-Norm Regularization

Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction

Towards Explaining Deep Neural Network Compression Through a Probabilistic Latent Space

A Highly Efficient Training-Aware Convolutional Neural Network Compression Paradigm

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

Dictionary Pair-based Data-Free Fast Deep Neural Network Compression

Conditional Automated Channel Pruning for Deep Neural Networks

Multi-Resolution Model Compression for Deep Neural Networks: A Variational Bayesian Approach