Self-Knowledge Distillation with Learning from Role-Model Samples.

Kai Xu, Lichun Wang, Huiyong Zhang, Baocai Yin
DOI: https://doi.org/10.1109/ICASSP48485.2024.10446387
2024-01-01
Abstract: Unlike traditional knowledge distillation, self-knowledge distillation does not require a pre-trained teacher network. Existing methods, however, require either additional parameters or additional memory. To alleviate this problem, this paper proposes a more efficient self-knowledge distillation method named LRMS (learning from role-model samples). In every mini-batch, LRMS selects a role-model sample for each sampled category and takes its prediction as the proxy semantic for that category. The predictions of the other samples are then constrained to be consistent with the proxy semantics, which makes the distribution of predictions for samples within the same category more compact. Meanwhile, the regularization targets corresponding to the proxy semantics are set with a higher distillation temperature to better utilize the classificatory information about the categories. Experimental results show that diverse architectures achieve improvements on four image classification datasets when trained with LRMS. Code is available at https://github.com/KAI1179/LRMS
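The per-batch procedure described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation (see the linked repository for that), and it rests on several assumptions not stated in the abstract: the role model is taken to be the sample with the highest softmax confidence on its own ground-truth class, consistency is measured with a KL divergence, and the temperatures `t_student=1.0` and `t_target=4.0` are placeholder values. All function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def select_role_models(batch_logits, labels):
    """Map each class present in the batch to the index of its role model.

    ASSUMPTION: the role model is the sample with the highest softmax
    confidence on its ground-truth class; the abstract does not state
    the selection criterion.
    """
    best = {}
    for i, (logits, y) in enumerate(zip(batch_logits, labels)):
        conf = softmax(logits)[y]
        if y not in best or conf > best[y][0]:
            best[y] = (conf, i)
    return {y: i for y, (_, i) in best.items()}


def lrms_style_loss(batch_logits, labels, t_student=1.0, t_target=4.0):
    """Average KL between each sample and its class's proxy semantic.

    The target (role-model prediction) is softened with a higher
    temperature than the student prediction, as the abstract describes.
    The temperature values themselves are illustrative assumptions.
    """
    role = select_role_models(batch_logits, labels)
    total, count = 0.0, 0
    for i, (logits, y) in enumerate(zip(batch_logits, labels)):
        if role[y] == i:
            continue  # the role model is not regularized toward itself
        target = softmax(batch_logits[role[y]], t_target)
        pred = softmax(logits, t_student)
        total += kl_divergence(target, pred)
        count += 1
    return total / count if count else 0.0


# Example: two classes, two samples each.
batch = [[4.0, 0.0, 0.0], [1.0, 0.5, 0.0], [0.0, 3.0, 0.0], [0.0, 1.0, 0.5]]
labels = [0, 0, 1, 1]
loss = lrms_style_loss(batch, labels)
```

In an actual training loop this term would presumably be added to the usual cross-entropy loss, with the role-model prediction treated as a fixed target (no gradient flowing through it). Because the targets come from the current mini-batch, a sketch of this kind needs neither an auxiliary network nor a stored history of past predictions, which matches the efficiency argument in the abstract.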