Automatic CNN Compression Based on Hyper-parameter Learning.

Nannan Tian, Yong Liu, Weiping Wang, Dan Meng
DOI: https://doi.org/10.1109/ijcnn52387.2021.9533329
2021-01-01
Abstract: Sparse regularization, such as L1 or L2,1 regularization, is the most popular way to induce sparse models. However, it introduces new hyper-parameters, which not only affect the degree of model sparsity but also determine whether the model can be trained effectively. How to select these hyper-parameters automatically is therefore an important open problem for regularization-based model compression. We propose an automatic CNN model compression framework based on the cross-validation gradient, which adjusts the hyper-parameters automatically and combines model parameter learning with hyper-parameter learning. First, to compute the hyper-parameter gradient (the cross-validation gradient), we introduce auxiliary variables that transform the non-differentiable L1 norm into a differentiable form and obtain the derivative of the model parameters with respect to the hyper-parameters; the cross-validation gradient then follows from the chain rule. Second, unlike common cross-validation methods, we propose an alternating learning method that interleaves parameter learning with hyper-parameter learning. It is a unified framework that does not need to train from scratch after each hyper-parameter update, which saves a lot of time compared with manual hyper-parameter tuning. Third, we do not need to specify the sparsity rate, whose selection also takes much time in pruning methods. Classical CNN architectures such as VGG, ResNet and DenseNet are tested on the CIFAR-10 and CIFAR-100 datasets to demonstrate the effectiveness of our algorithm. Our code is available at: https://github.com//tnn2018/AHLC.
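The cross-validation gradient idea in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a linear least-squares model instead of a CNN, assumes one common auxiliary-variable form of the L1 norm (|w_i| replaced by w_i^2 / (2 d_i) with d_i = |w_i| + eps held fixed during each solve), and solves the inner problem in closed form rather than by continued training. All function names (solve_weights, hypergradient, alternate) and parameters (lr, eps) are hypothetical.

```python
import numpy as np

def solve_weights(X, y, lam, d):
    """Minimize 0.5/n * ||Xw - y||^2 + lam * sum(w_i^2 / (2 * d_i)) for fixed d."""
    n = X.shape[0]
    H = X.T @ X / n + lam * np.diag(1.0 / d)  # Hessian of the training objective
    return np.linalg.solve(H, X.T @ y / n), H

def val_loss(Xv, yv, w):
    """Mean squared validation loss (with a 0.5 factor)."""
    r = Xv @ w - yv
    return 0.5 * r @ r / Xv.shape[0]

def hypergradient(X, y, Xv, yv, lam, d):
    """Chain rule: dL_val/dlam = (dL_val/dw) . (dw/dlam), with d held fixed."""
    w, H = solve_weights(X, y, lam, d)
    g_val = Xv.T @ (Xv @ w - yv) / Xv.shape[0]  # dL_val/dw
    dw_dlam = -np.linalg.solve(H, w / d)        # implicit differentiation of H w = X^T y / n
    return g_val @ dw_dlam, w

def alternate(X, y, Xv, yv, lam=0.1, steps=50, lr=0.05, eps=1e-3):
    """Alternate hyper-parameter updates with auxiliary-variable refreshes."""
    d = np.ones(X.shape[1])
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad, w = hypergradient(X, y, Xv, yv, lam, d)
        lam = max(lam - lr * grad, 0.0)  # gradient step on lam, kept non-negative
        d = np.abs(w) + eps              # refresh auxiliary variables from current weights
    return lam, w

# Small synthetic problem with a sparse ground-truth weight vector.
rng = np.random.default_rng(0)
w_true = np.array([2.0, 0.0, 0.0, -1.0, 0.0])
X = rng.normal(size=(40, 5))
y = X @ w_true + 0.1 * rng.normal(size=40)
Xv = rng.normal(size=(30, 5))
yv = Xv @ w_true + 0.1 * rng.normal(size=30)

lam_final, w_final = alternate(X, y, Xv, yv)
```

Under these assumptions the hyper-parameter is driven by the validation loss rather than chosen by hand, which mirrors the alternating scheme described above only in spirit; the paper's actual update rules for CNNs may differ.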