Global Instance Relation Distillation for convolutional neural network compression

Haolin Hu, Huanqiang Zeng, Yi Xie, Yifan Shi, Jianqing Zhu, Jing Chen
DOI: https://doi.org/10.1007/s00521-024-09635-9
2024-01-01
Neural Computing and Applications
Abstract: Previous instance-relation knowledge distillation methods transfer structural relations between instances from a heavy teacher network to a lightweight student network, effectively improving the accuracy of the student. However, these methods have two limitations: (1) relation knowledge is modeled only from the instances in the current mini-batch, so the instance relations are incomplete; (2) the information flow hidden in the evolution of instance relations throughout the network is neglected. To address these problems, we propose Global Instance Relation Distillation (GIRD) for convolutional neural network compression, which improves globality at both the instance level and the relation level. First, we design a feature reutilization mechanism that stores previously learned features, so that relation modeling is no longer confined to the current mini-batch. Second, we model the pairwise similarity-relation based on the stored features to reveal more complete instance relations. Furthermore, we construct the pairwise relation-evolution across different layers to reflect the information flow. Extensive experiments on benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches in various visual tasks.
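The three components named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the FIFO `FeatureBank` standing in for feature reutilization, cosine similarity as the pairwise similarity-relation, the element-wise difference between relation matrices of adjacent layers as the relation-evolution, and mean squared error as the matching loss. All names are hypothetical and are not taken from the paper.

```python
import math
from collections import deque


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine_relations(batch, stored):
    """Cosine similarity between every batch feature and every stored feature."""
    b = [l2_normalize(v) for v in batch]
    s = [l2_normalize(v) for v in stored]
    return [[sum(x * y for x, y in zip(u, w)) for w in s] for u in b]


class FeatureBank:
    """Assumed FIFO store of features from earlier mini-batches."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)  # oldest entries are dropped first

    def extend(self, feats):
        self.items.extend(list(f) for f in feats)

    def features(self):
        return list(self.items)


def mse(a, b):
    """Mean squared error between two equally shaped matrices."""
    sq = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(sq) / len(sq)


def relation_evolution(rel_shallow, rel_deep):
    """Assumed form: element-wise change of the relations between two layers."""
    return [[d - s for s, d in zip(rs, rd)] for rs, rd in zip(rel_shallow, rel_deep)]


def relation_losses(t_feats, s_feats, t_banks, s_banks):
    """Return (similarity_loss, evolution_loss) for per-layer batch features.

    t_feats / s_feats: one list of feature vectors per layer.
    t_banks / s_banks: one FeatureBank per layer, holding earlier features.
    """
    # Relations are taken against stored features plus the current batch.
    t_rel = [cosine_relations(f, b.features() + f) for f, b in zip(t_feats, t_banks)]
    s_rel = [cosine_relations(f, b.features() + f) for f, b in zip(s_feats, s_banks)]
    sim_loss = sum(mse(t, s) for t, s in zip(t_rel, s_rel)) / len(t_rel)
    if len(t_rel) < 2:
        return sim_loss, 0.0  # evolution needs at least two layers
    evo = [
        mse(relation_evolution(t_rel[i], t_rel[i + 1]),
            relation_evolution(s_rel[i], s_rel[i + 1]))
        for i in range(len(t_rel) - 1)
    ]
    return sim_loss, sum(evo) / len(evo)
```

In this sketch the caller would call `extend` on each bank after every training step so that later batches are compared against earlier features. Because the relation matrices have shape (batch size) x (stored + batch size), teacher and student features may have different dimensions as long as their banks hold the same number of entries.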