Attention Based Data Augmentation for Knowledge Distillation with Few Data

Shengzhao Tian, Duanbing Chen
DOI: https://doi.org/10.1088/1742-6596/2171/1/012058
2022-01-01
Journal of Physics Conference Series
Abstract: Knowledge distillation has attracted considerable attention from computer vision researchers in recent years. However, the performance of the student model suffers when the complete dataset used to train the teacher model is unavailable. This is especially true for knowledge distillation between heterogeneous models, where it is difficult for the student model to learn and receive guidance from only a few samples. In this paper, a data augmentation method based on the attentional response of the teacher model is proposed. The proposed method exploits the knowledge in the teacher model without requiring the teacher and student models to share a homogeneous architecture. Experimental results demonstrate that combining the proposed data augmentation method with different knowledge distillation methods improves the performance of the student model in knowledge distillation with few data.
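Illustrative sketch (not the authors' implementation): the abstract does not specify how the teacher's attentional response is computed or how it drives augmentation. The example below assumes one common choice, an activation-based attention map (channel-wise sum of squared teacher features), and one simple augmentation, cropping the input to the region the teacher attends to most. The function names attention_map and attention_crop, the 0.5 threshold, and the nested-list data layout are all hypothetical and chosen only to keep the example free of dependencies.

```python
def attention_map(features):
    """Collapse teacher features (C channels of H x W floats) into one
    H x W map: sum of squared activations, min-max scaled to [0, 1].
    ASSUMPTION: this attention definition is not taken from the paper."""
    height, width = len(features[0]), len(features[0][0])
    amap = [[sum(ch[y][x] ** 2 for ch in features) for x in range(width)]
            for y in range(height)]
    lo = min(min(row) for row in amap)
    hi = max(max(row) for row in amap)
    span = hi - lo
    if span == 0:  # flat response: no region stands out
        return [[0.0] * width for _ in range(height)]
    return [[(v - lo) / span for v in row] for row in amap]


def attention_crop(image, amap, threshold=0.5):
    """Crop image (H x W nested list) to the bounding box of attention
    cells >= threshold, scaling from feature to image resolution.
    ASSUMPTION: cropping is a stand-in for the paper's augmentation."""
    h, w = len(amap), len(amap[0])
    img_h, img_w = len(image), len(image[0])
    hot = [(y, x) for y in range(h) for x in range(w) if amap[y][x] >= threshold]
    if not hot:  # nothing above threshold: return an unmodified copy
        return [row[:] for row in image]
    top = min(y for y, _ in hot) * img_h // h
    bottom = ((max(y for y, _ in hot) + 1) * img_h + h - 1) // h  # ceiling
    left = min(x for _, x in hot) * img_w // w
    right = ((max(x for _, x in hot) + 1) * img_w + w - 1) // w  # ceiling
    return [row[left:right] for row in image[top:bottom]]


# Toy usage: two 4x4 feature channels that respond only in the centre.
features = [[[1.0 if 1 <= y <= 2 and 1 <= x <= 2 else 0.0 for x in range(4)]
             for y in range(4)] for _ in range(2)]
image = [[r * 8 + c for c in range(8)] for r in range(8)]
crop = attention_crop(image, attention_map(features))
```

Because the crop depends only on the teacher's feature response, a sketch like this places no constraint on the student's architecture, which is the property the abstract emphasizes for heterogeneous teacher and student models.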