Sensitivity Pruner: Filter-Level Compression Algorithm for Deep Neural Networks

Suhan Guo, Bilan Lai, Suorong Yang, Jian Zhao, Furao Shen
DOI: https://doi.org/10.1016/j.patcog.2023.109508
IF: 8
2023-01-01
Pattern Recognition
Abstract: As neural networks grow deeper to achieve better performance, the demand for models that can be deployed on resource-constrained devices also grows. In this work, we propose compressing models by eliminating less sensitive filters. A previous method evaluates neuron importance in a single shot using the gradient of the connection matrix. To mitigate the sampling bias, we integrate this measure into the previously proposed "pruning while fine-tuning" framework. In addition to the classification error, we introduce the difference between the learned strategy and the single-shot strategy as a second loss component, with a self-adjusting hyper-parameter that balances the training goal between improving accuracy and pruning more filters. Our Sensitivity Pruner (SP) adapts the saliency metric from unstructured pruning to structured pruning tasks and allows the strategy to be derived sequentially to accommodate the updating sparsity. Experimental results demonstrate that SP significantly reduces the computational cost, and that the pruned models give comparable or better performance on the CIFAR10, CIFAR100, and ILSVRC-12 datasets. (c) 2023 Elsevier Ltd. All rights reserved.
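The abstract does not give the exact saliency formula, so the following is only a minimal sketch of how a gradient-based connection sensitivity might be lifted from individual weights to whole filters. It assumes a SNIP-style score |w * dL/dw| summed over each filter and normalised across the layer; the function names `filter_sensitivity` and `keep_mask`, the `keep_ratio` parameter, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def filter_sensitivity(weights: Sequence[Sequence[float]],
                       grads: Sequence[Sequence[float]]) -> List[float]:
    """Per-filter saliency (assumed form): sum of |w * dL/dw| over the
    filter's weights, normalised so the scores of a layer sum to 1."""
    raw = [sum(abs(w * g) for w, g in zip(fw, fg))
           for fw, fg in zip(weights, grads)]
    total = sum(raw)
    if total == 0.0:
        return [0.0 for _ in raw]
    return [r / total for r in raw]


def keep_mask(scores: Sequence[float], keep_ratio: float) -> List[bool]:
    """Keep the highest-scoring fraction of filters (always at least one)."""
    n_keep = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [i in kept for i in range(len(scores))]


# Toy layer: three filters with two weights each (made-up values).
weights = [[0.5, -0.5], [1.0, 2.0], [0.1, 0.1]]
grads = [[0.2, 0.2], [0.5, -0.25], [0.0, 0.1]]

scores = filter_sensitivity(weights, grads)
mask = keep_mask(scores, keep_ratio=2 / 3)
print(scores)
print(mask)
```

In this toy layer the third filter has the smallest aggregated sensitivity and is the one masked out. The sketch covers only a single-shot ranking; the learned pruning strategy and the second loss term described in the abstract are not modelled here.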