Stochastic Ghost Batch for Self-distillation with Dynamic Soft Label

Qian Li, Qingyuan Hu, Saiyu Qi, Yong Qi, Di Wu, Yun Lin, Jin Song Dong
DOI: https://doi.org/10.1016/j.knosys.2021.107936
IF: 8.139
2022-01-01
Knowledge-Based Systems
Abstract: Deep neural networks excel at learning patterns from finite training data but often make incorrect predictions with high confidence when faced with out-of-distribution data. In this work, we propose a data-agnostic framework called Stochastic Ghost Batch Augmentation (SGBA) to address these issues. It stochastically augments activation units at training iterations to amend the model's irregular prediction behaviors by leveraging the partial generalization ability of the intermediate model. A self-distilled dynamic soft label is introduced as a regularization term to establish the lost connection; it incorporates the similarity prior in the vicinity distribution with respect to raw samples, rather than conforming the model to static hard labels. In addition, the induced stochasticity reduces much of the unnecessary, redundant computational cost of conventional batch augmentation, which is performed at every pass. The proposed regularization provides direct supervision through the KL-divergence between the output softmax distributions of the original and virtual data, and enforces distribution matching to fuse the complementary information in the model's predictions, which gradually mature and stabilize as training proceeds. In essence, it is a dynamic check, or test, of the neural network's generalization during training. Extensive performance evaluations demonstrate the superiority of our proposed framework.
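The regularization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the augmentation probability `aug_prob`, the weight `weight`, and the direction of the KL term are all assumptions made for illustration, since the abstract does not specify them. The virtual logits are assumed to come from a forward pass with augmented activations and are simply passed in here.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Hard-label cross-entropy for a single sample."""
    return -math.log(probs[label] + eps)


def sgba_style_loss(logits_orig, logits_virtual, label,
                    aug_prob=0.5, weight=1.0, rng=random):
    """Hypothetical per-sample loss: cross-entropy plus a stochastic KL term.

    With probability `aug_prob` (an assumed hyperparameter) the KL-divergence
    between the softmax outputs on the original and the virtual (augmented)
    input is added, so the model's own prediction acts as a dynamic soft label.
    """
    p_orig = softmax(logits_orig)
    loss = cross_entropy(p_orig, label)
    if rng.random() < aug_prob:
        # Assumption: the original prediction is the target distribution.
        p_virtual = softmax(logits_virtual)
        loss += weight * kl_divergence(p_orig, p_virtual)
    return loss
```

Calling `sgba_style_loss([2.0, 0.5, -1.0], [1.5, 0.7, -0.8], label=0)` returns either the plain cross-entropy or the cross-entropy plus the KL penalty, depending on the random draw. Skipping the extra term on some iterations is one way to read the abstract's claim that stochasticity saves computation.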