Soft Taylor Pruning for Accelerating Deep Convolutional Neural Networks.

Jintao Rong, Xiyi Yu, Mingyang Zhang, Linlin Ou
DOI: https://doi.org/10.1109/iecon43393.2020.9254493
2020-01-01
Abstract: Network pruning is widely used to accelerate the inference of deep models in low-resource settings. In this paper, a novel gradient-based method, Soft Taylor Pruning (STP), is proposed to reduce network complexity in a dynamic way. Given a global compression rate, all filters are divided into two groups by a gradient-based evaluation criterion during pruning. The two groups of filters are then remixed and updated during the training epoch. When a pruning error occurs, the model can correct it by reactivating pruned filters in the next training epoch. In this way, the capacity of the model's channel space remains the same until the network structure converges. To reduce the influence of large-weighted filters on the criterion, we take the absolute value of the product of the feature map and the gradient as the evaluation criterion. To reduce pruning time, STP allows simultaneous pruning of multiple layers by controlling the opening and closing of multiple mask layers. Moreover, STP can be applied to various advanced CNNs, such as MobileNet. The features of our method are: 1) it maintains the integrity of the model's channel space; 2) it lowers the time cost of model compression; 3) it depends less on the pretrained model.
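The criterion and soft masking described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the (N, C, H, W) tensor layout, the averaging over batch and spatial dimensions, and the absence of per-layer score normalization are all assumptions made for illustration; the abstract only states that the score is the absolute value of the product of the feature map and the gradient, that a global compression rate splits filters into two groups, and that pruned filters can be reactivated later.

```python
import numpy as np


def taylor_scores(feature_map, grad):
    """Per-channel importance: |mean(activation * gradient)|.

    Both inputs are assumed to have shape (N, C, H, W). Averaging over
    batch and spatial dimensions is an assumption, not taken from the paper.
    """
    return np.abs((feature_map * grad).mean(axis=(0, 2, 3)))


def global_soft_masks(scores_per_layer, compression_rate):
    """Build one 0/1 mask per layer from a single global ranking.

    The lowest-scoring fraction `compression_rate` of all filters,
    pooled across layers, is masked with 0; the rest keep 1.
    """
    all_scores = np.concatenate(scores_per_layer)
    n_prune = int(compression_rate * all_scores.size)
    flat_mask = np.ones(all_scores.size)
    if n_prune > 0:
        order = np.argsort(all_scores, kind="stable")
        flat_mask[order[:n_prune]] = 0.0
    # Split the flat mask back into per-layer masks.
    boundaries = np.cumsum([s.size for s in scores_per_layer])[:-1]
    return np.split(flat_mask, boundaries)


def apply_soft_mask(weights, mask):
    """Return masked conv weights of shape (C_out, C_in, kH, kW).

    The stored weights are left untouched, so a masked filter keeps its
    values and can be reactivated if a later mask selects it again.
    """
    return weights * mask[:, None, None, None]
```

Because the masks are recomputed from fresh scores rather than applied destructively, a filter masked in one epoch can return in the next, which is the "soft" behaviour the abstract describes. Pooling raw scores across layers is the simplest choice; a real implementation may need to rescale scores per layer before ranking.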