Bridging the Knowledge Gap via Transformer-Based Multi-Layer Correlation Learning

Hun-Beom Bak, Seung-Hwan Bae
DOI: https://doi.org/10.1109/access.2024.3387859
IF: 3.9
2024-04-20
IEEE Access
Abstract: We address multi-layer knowledge distillation between deep models with heterogeneous architectures. The main challenge is that the feature maps of the two models are mismatched in resolution or semantic level. To resolve this, we propose a novel transformer-based multi-layer correlation knowledge distillation (TMC-KD) method that bridges the knowledge gap between a pair of networks. Our method narrows the relational knowledge gap between the teacher and student models by minimizing the differences between their local and global feature correlations. Extensive comparisons with recent KD methods on classification and detection tasks demonstrate the effectiveness of TMC-KD.
computer science, information systems; telecommunications; engineering, electrical & electronic
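
Illustrative sketch (not the authors' TMC-KD implementation): the abstract describes matching feature correlations between a teacher and a student whose feature maps differ in resolution. One generic way to make such maps comparable is to pool both to a common grid and compare the pairwise similarities between spatial positions, since those matrices have the same size regardless of channel count. The function names, the average pooling step, the cosine similarity, and the mean squared error below are all assumptions chosen for illustration; the transformer component of the method is not modeled.

```python
import math


def avg_pool_to_grid(fmap, out_h, out_w):
    """Average-pool a feature map shaped [C][H][W] down to [C][out_h][out_w].

    Assumes H and W are divisible by out_h and out_w (illustrative only).
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    sh, sw = h // out_h, w // out_w
    pooled = []
    for channel in fmap:
        rows = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                block = [channel[y][x]
                         for y in range(i * sh, (i + 1) * sh)
                         for x in range(j * sw, (j + 1) * sw)]
                row.append(sum(block) / len(block))
            rows.append(row)
        pooled.append(rows)
    return pooled


def correlation_matrix(fmap):
    """Cosine similarity between every pair of spatial positions.

    Returns an N x N matrix with N = H * W, independent of the channel count.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    # One channel vector ("token") per spatial position.
    tokens = [[fmap[k][y][x] for k in range(c)]
              for y in range(h) for x in range(w)]
    # Guard against zero vectors by treating their norm as 1.
    norms = [math.sqrt(sum(v * v for v in t)) or 1.0 for t in tokens]
    n = len(tokens)
    return [[sum(a * b for a, b in zip(tokens[i], tokens[j]))
             / (norms[i] * norms[j])
             for j in range(n)] for i in range(n)]


def correlation_distill_loss(teacher_fmap, student_fmap, grid=(2, 2)):
    """Mean squared difference between teacher and student correlation matrices."""
    t = correlation_matrix(avg_pool_to_grid(teacher_fmap, *grid))
    s = correlation_matrix(avg_pool_to_grid(student_fmap, *grid))
    n = len(t)
    return sum((t[i][j] - s[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)
```

For example, a teacher map with 3 channels at 4x4 resolution and a student map with 2 channels at 2x2 resolution both yield 4x4 correlation matrices after pooling to a 2x2 grid, so the loss is defined even though neither the channel counts nor the resolutions match.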