Learning soft threshold for sparse reparameterization using gradual projection operators
Xiaodong Wang, Xianxian Zeng, Yun Zhang, Dong Li, Weijun Yang
DOI: https://doi.org/10.1016/j.neucom.2022.03.009
IF: 6
2022-06-01
Neurocomputing
Abstract: Deep neural networks (DNNs) have achieved great success in computer vision in recent years. Although highly accurate, DNNs are over-parameterized, which impedes their deployment on lightweight devices. To obtain parameter-efficient networks, a large body of work based on uniform sparsity or heuristic non-uniform sparsity techniques has been explored. However, these sparsity techniques offer limited improvement in inference speed (FLOPs) and prediction accuracy. To make further progress, we propose novel gradual projection operators (GPO) to learn the soft threshold for sparse reparameterization. GPO approximates the soft-threshold operator with a family of projection operators that progressively reduce the gradient of the weights to be pruned during training, so that the pruning thresholds are learned gently. Experiments on ImageNet show that ResNet-50 trained with the proposed algorithm achieves 76.52% top-1 validation accuracy at 81.62% sparsity, only a 0.5% accuracy gap from its dense counterpart. Additionally, the non-uniform budgets learned by GPO reduce FLOPs by up to 10% compared with state-of-the-art methods and outperform popular heuristic methods, thus yielding an effective mechanism for sparse reparameterization.
computer science, artificial intelligence
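
For readers unfamiliar with the operator named in the abstract, the following is a minimal sketch of the standard soft-threshold operator, sign(w) * max(|w| - t, 0), applied to a small weight vector. It is not the GPO method itself: the abstract does not give the form of the projection operators, and in the paper the threshold is learned rather than fixed. The function names `soft_threshold` and `sparsity`, the example weights, and the threshold value 0.1 are all assumptions made for illustration.

```python
def soft_threshold(weights, threshold):
    """Standard soft-threshold operator: sign(w) * max(|w| - threshold, 0).

    Weights whose magnitude is at or below the threshold become exactly zero
    (pruned); the remaining weights are shrunk toward zero by the threshold.
    """
    result = []
    for w in weights:
        magnitude = max(abs(w) - threshold, 0.0)
        if magnitude == 0.0:
            result.append(0.0)  # pruned weight
        else:
            result.append(magnitude if w > 0 else -magnitude)
    return result


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Hypothetical layer weights and a fixed (not learned) threshold.
weights = [0.8, -0.05, 0.3, -0.6, 0.02]
pruned = soft_threshold(weights, 0.1)
layer_sparsity = sparsity(pruned)  # two of the five weights are zeroed
```

With a separate threshold per layer, each layer would end up with its own sparsity level, which is one way to read the "non-uniform budgets" mentioned in the abstract.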