KTAN: Knowledge Transfer Adversarial Network.

Peiye Liu, Wu Liu, Huadong Ma, Zhewei Jiang, Mingoo Seok
DOI: https://doi.org/10.1109/ijcnn48605.2020.9207235
2018-01-01
Abstract: Knowledge distillation was pioneered to transfer the generalization ability of a large teacher deep network to a lightweight student network. The student network can retain the high quality of the teacher network while exhibiting low computational complexity and storage requirements, which is attractive for deploying a deep convolutional neural network on a resource-constrained mobile device. However, most existing methods focus on transferring the probability distribution of the softmax layer in a teacher network and neglect the intermediate representations. We find that the intermediate representation is critical for a student network to better understand the transferred generalization, compared with the probability distribution alone. In this paper, therefore, we propose a knowledge transfer adversarial network (KTAN) method that holistically considers both the intermediate representations and the probability distributions of a teacher network. To transfer the knowledge of intermediate representations, we set high-level teacher feature maps as a target, toward which the method trains student feature maps. Furthermore, to support various structures of a student network, we introduce a novel teacher-to-student layer. Finally, the proposed method employs an adversarial learning process: it includes a discriminator network to fully exploit the spatial correlation of feature maps during the training of a student network. Experimental results demonstrate that the proposed method can significantly improve the performance of a student network on two important vision tasks, image classification and object detection.
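The abstract names three training signals for the student: the teacher's softmax distribution, the teacher's high-level feature maps, and a discriminator's judgment of the student's feature maps. The abstract does not give the loss itself, so the following is only an illustrative sketch of how such terms could be combined. The function names, the temperature, the weights `w_soft`, `w_feat`, `w_adv`, the use of KL divergence and mean squared error, and the non-saturating adversarial term are all assumptions, not the authors' formulation. The teacher feature map is assumed to have already been mapped to the student's shape (the role the abstract assigns to the teacher-to-student layer).

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def feature_mse(teacher_map, student_map):
    """Mean squared error between two equally shaped 2-D feature maps."""
    diffs = [
        (t - s) ** 2
        for t_row, s_row in zip(teacher_map, student_map)
        for t, s in zip(t_row, s_row)
    ]
    return sum(diffs) / len(diffs)


def adversarial_student_loss(disc_score, eps=1e-12):
    """Assumed non-saturating adversarial term.

    disc_score is the discriminator's probability that the student's
    feature map came from the teacher; the loss shrinks as it nears 1.
    """
    return -math.log(disc_score + eps)


def student_objective(teacher_logits, student_logits,
                      teacher_map, student_map, disc_score,
                      temperature=4.0, w_soft=1.0, w_feat=1.0, w_adv=0.1):
    """Weighted sum of the three (assumed) loss terms for the student."""
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    feat = feature_mse(teacher_map, student_map)
    adv = adversarial_student_loss(disc_score)
    return w_soft * soft + w_feat * feat + w_adv * adv
```

In this sketch the objective is near zero when the student reproduces the teacher's logits and feature map and the discriminator is fully fooled, and it grows as any of the three signals diverges. In an actual training loop the discriminator would be updated in alternation with the student; that step is omitted here.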