A Novel Framework for Online Knowledge Distillation

Shixiao Fan, Minghui Liu, Chun Yang, Wentao Xia, Xuan Cheng, Ming Liu
DOI: https://doi.org/10.1109/iccc56324.2022.10065706
2022-01-01
Abstract: Traditional knowledge distillation transfers the capabilities of a large network to a smaller one through two-stage training. Recent online knowledge distillation instead uses the aggregated intermediate predictions of multiple peer models as the training target for each peer. Although this approach removes the need for a large teacher model, its simple aggregation functions lead to severe homogenization among peers, which reduces the effectiveness of distillation. In this paper, we propose a novel online knowledge distillation strategy that counters the homogenization problem by randomly augmenting the inputs. Specifically, we construct a multi-network model and intervene in the training process by applying a set pattern of image augmentations, so that the networks benefit from different augmentations during training and the diversity among peers is enriched. Experimental results on CIFAR-10 and CIFAR-100 show significant improvements on several network architectures.
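The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the aggregation function, the loss, or the augmentations, so the choices below are assumptions. The sketch assumes the aggregated target is the mean of the peers' temperature-softened predictions, that each peer is pulled toward it with a KL-divergence loss, and that every peer receives a different, randomly assigned augmentation of the same image. All function names (`assign_views`, `peer_distillation_losses`, and the toy augmentations) are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Toy augmentations on an image stored as a list of pixel rows.
# These stand in for real image augmentations (assumption).
def identity(image):
    return [row[:] for row in image]


def horizontal_flip(image):
    return [row[::-1] for row in image]


def shift_right(image):
    # Shift every row one pixel to the right, padding with zero.
    return [[0.0] + row[:-1] for row in image]


AUGMENTATIONS = [identity, horizontal_flip, shift_right]


def assign_views(image, num_peers, rng):
    """Give each peer a different, randomly chosen augmentation.

    Requires num_peers <= len(AUGMENTATIONS), because the operations
    are sampled without replacement.
    """
    ops = rng.sample(AUGMENTATIONS, k=num_peers)
    return [op(image) for op in ops]


def peer_distillation_losses(peer_logits, temperature=3.0):
    """Per-peer KL loss against the mean of all peers' soft predictions.

    Mean aggregation and the KL loss are assumptions made for this sketch.
    """
    probs = [softmax(logits, temperature) for logits in peer_logits]
    num_peers = len(probs)
    num_classes = len(probs[0])
    target = [sum(p[c] for p in probs) / num_peers for c in range(num_classes)]
    return [kl_divergence(target, p) for p in probs]
```

In a training loop, each peer network would compute logits on its own view from `assign_views`, and the values returned by `peer_distillation_losses` would be added to the usual supervised loss. When all peers produce identical predictions the distillation losses vanish, which is the homogenization the abstract describes; feeding the peers different views is one way to keep their predictions apart.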