ARPruning: An automatic channel pruning based on attention map ranking

Tongtong Yuan, Zulin Li, Bo Liu, Yinan Tang, Yujia Liu
DOI: https://doi.org/10.1016/j.neunet.2024.106220
IF: 7.8
2024-03-06
Neural Networks
Abstract: Structured pruning is a representative model compression technique for convolutional neural networks (CNNs) that removes less important filters or channels. Most recent structured pruning methods define criteria for filter importance that are based mainly on the magnitude of weights or other network parameters. However, these criteria lack explainability, and the numerical values of network parameters alone are insufficient to assess the relationship between a channel and model performance. Moreover, applying these criteria directly to global pruning may lead to suboptimal solutions, so a search algorithm is needed to determine the pruning ratio for each layer. To address these issues, we propose ARPruning (Attention-map-based Ranking Pruning), which constructs a new pruning criterion for the importance of intra-layer channels and develops a new local neighborhood search algorithm for determining the optimal inter-layer pruning ratios. To measure the relationship between a channel to be pruned and model performance, we construct an intra-layer channel importance criterion from the attention map of each layer. We then propose an automatic search method for pruning strategies that finds the optimal solution effectively and efficiently. By integrating the pruning criterion and the search strategy, ARPruning maintains a high compression rate while achieving outstanding accuracy. Experiments also show that ARPruning achieves better compression results than state-of-the-art pruning methods. The code is available at https://github.com/dozingLee/ARPruning.
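The two components named in the abstract (an attention-map-based channel ranking within a layer and a local neighborhood search over per-layer pruning ratios) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the repository linked above has that. Everything here is an assumption made for illustration: the attention map is taken as the channel-wise sum of squared activations, the hypothetical `channel_scores` ranks a channel by the cosine similarity between its own squared activation map and the layer's attention map, and the hypothetical `local_search` is a plain greedy neighborhood search with an assumed ratio cap of 0.9.

```python
import numpy as np

def attention_map(features):
    """Spatial attention map of one layer (assumed form): sum of squared
    activations over the channel axis. features has shape (C, H, W)."""
    return (features ** 2).sum(axis=0)

def channel_scores(features):
    """Hypothetical importance score: cosine similarity between each channel's
    squared activation map and the layer's attention map."""
    amap = attention_map(features).ravel()
    per_channel = (features ** 2).reshape(features.shape[0], -1)
    num = per_channel @ amap
    den = np.linalg.norm(per_channel, axis=1) * np.linalg.norm(amap) + 1e-12
    return num / den

def keep_indices(features, prune_ratio):
    """Rank channels by score and return the sorted indices of those kept."""
    scores = channel_scores(features)
    n_keep = max(1, int(round(len(scores) * (1.0 - prune_ratio))))
    return np.sort(np.argsort(scores)[::-1][:n_keep])

def local_search(ratios, objective, step=0.1, max_iters=50):
    """Greedy neighborhood search (illustrative only): change one layer's
    pruning ratio by +/- step and move to the best improving neighbor."""
    current = np.asarray(ratios, dtype=float)
    best = objective(current)
    for _ in range(max_iters):
        best_neighbor = None
        for i in range(len(current)):
            for delta in (-step, step):
                cand = current.copy()
                # Ratios are clipped to [0, 0.9]; the cap is an assumption.
                cand[i] = np.clip(cand[i] + delta, 0.0, 0.9)
                val = objective(cand)
                if val > best + 1e-12:
                    best, best_neighbor = val, cand
        if best_neighbor is None:
            break  # no neighbor improves the objective: local optimum
        current = best_neighbor
    return current, best
```

In practice the `objective` passed to `local_search` would have to combine the accuracy of the pruned network with its compression rate; how ARPruning scores a candidate strategy is not stated in the abstract, so it is left as a caller-supplied function here.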
computer science, artificial intelligence, neurosciences
What problem does this paper attempt to address?