Joint Multi-Dimension Pruning via Numerical Gradient Update
Zechun Liu,Xiangyu Zhang,Zhiqiang Shen,Yichen Wei,Kwang-Ting Cheng,Jian Sun
DOI: https://doi.org/10.1109/tip.2021.3112041
IF: 10.6
2021-01-01
IEEE Transactions on Image Processing
Abstract:We present joint multi-dimension pruning (abbreviated as JointPruning), an effective method of pruning a network on three crucial aspects: spatial, depth and channel simultaneously. To tackle these three naturally different dimensions, we proposed a general framework by defining pruning as seeking the best pruning vector (i.e., the numerical value of layer-wise channel number, spatial size, depth) and construct a unique mapping from the pruning vector to the pruned network structures. Then we optimize the pruning vector with gradient update and model joint pruning as a numerical gradient optimization process. To overcome the challenge that there is no explicit function between the loss and the pruning vectors, we proposed self-adapted stochastic gradient estimation to construct a gradient path through network loss to pruning vectors and enable efficient gradient update. We show that the joint strategy discovers a better status than previous studies that focused on individual dimensions solely, as our method is optimized collaboratively across the three dimensions in a single end-to-end training and it is more efficient than the previous exhaustive methods. Extensive experiments on large-scale ImageNet dataset across a variety of network architectures MobileNet V1&V2&V3 and ResNet demonstrate the effectiveness of our proposed method. For instance, we achieve significant margins of 2.5% and 2.6% improvement over the state-of-the-art approach on the already compact MobileNet V1&V2 under an extremely large compression ratio.
computer science, artificial intelligence,engineering, electrical & electronic