Students and teachers learning together: a robust training strategy for neural network pruning

Liyan Xiong, Qingsen Chen, Jiawen Huang, Xiaohui Huang, Peng Huang, Shangfeng Wei
DOI: https://doi.org/10.1007/s00530-024-01315-x
IF: 3.9
2024-04-13
Multimedia Systems
Abstract: Convolutional neural networks (CNNs) serve as the backbone for extracting image features in the majority of computer vision tasks. To make them deployable on small devices, many researchers have released small neural networks that were either designed by hand or obtained by compressing large models through model pruning. Model pruning is a simple and efficient way to speed up neural networks. However, the pruned model (sparse network) performs worse than the original model (dense network) and is not easy to train to convergence. Recent popular work has focused on improving the effectiveness and convergence of sub-networks. In this paper, we approach the problem from the perspective of how to narrow the performance gap between sparse and dense networks, rather than how to obtain a better sub-network. To bridge this gap, we propose a novel training strategy based on mutual learning. Furthermore, we provide a new pruning criterion called matching distance (MD) that aims to enable the sparse networks to inherit the majority of the knowledge learned by the dense networks. The experimental results demonstrate that our approach enables knowledge from dense networks to be transferred to sparse networks more efficiently.
computer science, information systems, theory & methods
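The abstract names mutual learning between the dense and sparse networks but does not give the loss. As a rough illustration only, the sketch below assumes the standard deep-mutual-learning form, in which each network minimizes its own cross-entropy plus a KL term pulling it toward its peer's predictions. The function names, the `alpha` weight, and the per-sample formulation are assumptions for illustration, not the paper's method, and the matching distance (MD) criterion is not shown because the abstract does not define it.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-likelihood of the true class.
    return -math.log(probs[label])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_learning_losses(dense_logits, sparse_logits, label, alpha=1.0):
    # Hypothetical helper: each network is supervised by the label
    # and by its peer's predicted distribution (assumed loss form).
    p_dense = softmax(dense_logits)
    p_sparse = softmax(sparse_logits)
    loss_dense = cross_entropy(p_dense, label) + alpha * kl_divergence(p_sparse, p_dense)
    loss_sparse = cross_entropy(p_sparse, label) + alpha * kl_divergence(p_dense, p_sparse)
    return loss_dense, loss_sparse
```

Under this assumed form, the peer term vanishes when the two networks already agree, so each loss reduces to plain cross-entropy. In an actual training loop, each network would be updated on its own loss with the peer's probabilities treated as constants.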