Explore a Novel Knowledge Distillation Framework for Network Learning and Low-Bit Quantization

Liang Si, Yuhai Li, Hengyi Zhou, Jiahua Liang, Longjun Liu
DOI: https://doi.org/10.1109/cac53003.2021.9728523
2021-01-01
Abstract: Knowledge distillation is a model compression method. It improves the performance of a "student" network by transferring the knowledge of a "teacher" network to it. However, because of the large gap between "teacher" and "student" in knowledge representation capability, transferring knowledge by directly minimizing the difference between them leads to convergence problems. To address this, this paper proposes a novel knowledge distillation framework. By sharing the weights of the "teacher" and "student" in the fully connected layers, we reduce the difficulty of knowledge transfer. Furthermore, building on this distillation framework, we propose a novel training strategy for improving the performance of low-bit quantized networks, called distillation for low-bit quantization (DLBQ). The experimental results show that our methods achieve significant improvements across different tasks. For instance, ResNet-20 gains a 1.81% improvement on the CIFAR-10 dataset. ResNet-56 shows a 3.36% improvement on the CIFAR-100 dataset and even outperforms the "teacher" network by 1.72%. As for quantization performance, ResNet-20 gains a 1.25% improvement with ternary weights on CIFAR-10, and ResNet-110 shows a 3.22% improvement with binary weights on CIFAR-100.
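The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the two ideas it names, under stated assumptions: a single fully connected classifier whose weights are shared by the "teacher" and "student" feature extractors, and a low-bit (ternary) weight mapping. The function names (`shared_fc`, `distillation_loss`, `ternarize`), the temperature-scaled KL loss, the 64-dimensional features, and the threshold-based ternarization heuristic are all illustrative choices, not the authors' method.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature scaling."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def shared_fc(features, W, b):
    """One fully connected layer; the SAME W and b serve both networks (assumption)."""
    return features @ W + b

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """Mean KL(teacher || student) on softened outputs, scaled by T^2.
    This is the common soft-target loss, used here as a stand-in."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=1)
    return float(np.mean(kl)) * temperature ** 2

def ternarize(w, threshold_ratio=0.7):
    """Map weights to {-alpha, 0, +alpha} with a magnitude threshold.
    A generic ternarization heuristic, not necessarily the one used in DLBQ."""
    delta = threshold_ratio * np.mean(np.abs(w))
    mask = np.abs(w) > delta
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return np.where(mask, alpha * np.sign(w), 0.0)

# Hypothetical setup: both backbones are assumed to emit 64-dimensional features.
rng = np.random.default_rng(0)
teacher_feat = rng.normal(size=(8, 64))
student_feat = rng.normal(size=(8, 64))
W = rng.normal(size=(64, 10)) * 0.1  # shared classifier weights
b = np.zeros(10)

t_logits = shared_fc(teacher_feat, W, b)
s_logits = shared_fc(student_feat, W, b)
loss = distillation_loss(t_logits, s_logits)
W_ternary = ternarize(W)
```

Because both sets of logits pass through the same `W` and `b`, any gap between them comes only from the feature extractors, which is one plausible reading of how weight sharing could ease knowledge transfer.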