Self-knowledge Distillation Based on Knowledge Transfer from Soft to Hard Examples.

Yuan Tang, Ying Chen, Linbo Xie
DOI: https://doi.org/10.1016/j.imavis.2023.104700
IF: 3.86
2023-01-01
Image and Vision Computing
Abstract: To fully exploit the knowledge in a self-knowledge distillation network, in which a student model is progressively trained to distill its own knowledge without a pre-trained teacher model, a self-knowledge distillation method based on knowledge transfer from soft to hard examples is proposed. A knowledge transfer module is designed to exploit the dark knowledge of hard examples by enforcing class probability consistency between hard and soft examples. It reduces the confidence of wrong predictions by transferring class information from the soft probability distributions of the auxiliary self-teacher network to the classifier network (the self-student network). Furthermore, a dynamic memory bank for softened probability distributions is introduced, together with its updating strategy. Experiments show that the method improves accuracy by 0.64% on average on classification datasets and by 3.87% on average on fine-grained visual recognition tasks, outperforming state-of-the-art methods. © 2023 Elsevier B.V. All rights reserved.
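The abstract does not give the loss or the memory-bank update rule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a temperature-scaled softmax for the "softened" distributions, a KL-divergence consistency term between a self-teacher distribution and a self-student distribution, and an exponential-moving-average update for a per-class memory bank. The names `softened_probs`, `consistency_loss`, and `SoftLabelMemoryBank`, the temperature of 4.0, the momentum of 0.9, and the T^2 scaling are all illustrative assumptions.

```python
import math


def softened_probs(logits, temperature=4.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(teacher_logits, student_logits, temperature=4.0):
    """Assumed consistency term: pull the student's softened distribution
    toward the teacher's. The T^2 factor is a common distillation convention,
    not something stated in the abstract."""
    p_teacher = softened_probs(teacher_logits, temperature)
    p_student = softened_probs(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


class SoftLabelMemoryBank:
    """Hypothetical per-class bank of softened distributions.

    The EMA update below is an assumption; the paper's actual updating
    strategy is not described in the abstract.
    """

    def __init__(self, num_classes, momentum=0.9):
        uniform = [1.0 / num_classes] * num_classes
        self.bank = {c: list(uniform) for c in range(num_classes)}
        self.momentum = momentum

    def update(self, label, probs):
        old = self.bank[label]
        mixed = [self.momentum * o + (1.0 - self.momentum) * p
                 for o, p in zip(old, probs)]
        total = sum(mixed)  # renormalise to guard against rounding drift
        self.bank[label] = [m / total for m in mixed]


# Usage: a confident (soft-example) teacher output guides an uncertain
# (hard-example) student output of the same class.
teacher_logits = [6.0, 1.0, 0.5]
student_logits = [1.2, 1.0, 0.9]
loss = consistency_loss(teacher_logits, student_logits)

bank = SoftLabelMemoryBank(num_classes=3)
bank.update(label=0, probs=softened_probs(teacher_logits))
```

In this sketch the loss is zero when the two distributions agree and grows as the student's softened prediction drifts from the stored or teacher distribution, which is one plausible way to realise the "class probability consistency" the abstract mentions.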