Task Decoupled Knowledge Distillation for Lightweight Face Detectors.

Xiaoqing Liang, Xu Zhao, Chaoyang Zhao, Nanfei Jiang, Ming Tang, Jinqiao Wang
DOI: https://doi.org/10.1145/3394171.3414069
2020-01-01
Abstract: Face detection is a hot topic in computer vision. Face detection methods usually consist of two subtasks, i.e., the classification subtask and the regression subtask, which are trained with different samples. However, current knowledge distillation methods for face detection usually couple the two subtasks and use the same set of samples in the distillation task. In this paper, we propose a task decoupled knowledge distillation method, which decouples the detection distillation task into two subtasks and uses different samples when distilling the features of each subtask. We first propose a feature decoupling method that separates the classification features from the regression features without introducing any extra computation at inference time. Specifically, we generate the corresponding features by adding task-specific convolutions to the teacher network and adaption convolutions on the feature maps of the student network. We then select different samples for each subtask to imitate. Moreover, we propose an effective probability distillation method to jointly boost the accuracy of the student network. We apply our distillation method to a lightweight face detector, EagleEye. Experimental results show that the proposed method improves the student detector's accuracy by 5.1%, 5.1%, and 2.8% AP on the Easy, Medium, and Hard subsets, respectively.
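The idea of distilling each subtask on its own set of samples can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the loss functions, so the masked mean squared error for feature imitation, the binary KL divergence for probability distillation, the loss weights, and all function and parameter names below are assumptions made for illustration. The feature lists stand in for the outputs of the teacher's task-specific convolutions and the student's adaption convolutions, flattened to one vector per location.

```python
import math


def masked_mse(student, teacher, mask):
    """Mean squared error over the locations selected by mask (truthy = imitate).

    student, teacher: lists of equal-length feature vectors, one per location.
    mask: one flag per location; unselected locations contribute nothing.
    """
    total, count = 0.0, 0
    for s_vec, t_vec, m in zip(student, teacher, mask):
        if m:
            total += sum((s - t) ** 2 for s, t in zip(s_vec, t_vec))
            count += len(s_vec)
    return total / count if count else 0.0


def binary_kl(p_teacher, p_student, eps=1e-7):
    """Mean KL(teacher || student) over per-location face probabilities."""
    total = 0.0
    for pt, ps in zip(p_teacher, p_student):
        # Clamp to avoid log(0); eps is an assumed numerical-safety constant.
        pt = min(max(pt, eps), 1.0 - eps)
        ps = min(max(ps, eps), 1.0 - eps)
        total += pt * math.log(pt / ps)
        total += (1.0 - pt) * math.log((1.0 - pt) / (1.0 - ps))
    return total / len(p_teacher)


def decoupled_distill_loss(s_cls, t_cls, cls_mask,
                           s_reg, t_reg, reg_mask,
                           p_teacher, p_student,
                           w_cls=1.0, w_reg=1.0, w_prob=1.0):
    """Hypothetical combined loss: each subtask imitates on its own samples."""
    cls_term = masked_mse(s_cls, t_cls, cls_mask)   # classification features
    reg_term = masked_mse(s_reg, t_reg, reg_mask)   # regression features
    prob_term = binary_kl(p_teacher, p_student)     # probability distillation
    return w_cls * cls_term + w_reg * reg_term + w_prob * prob_term
```

Passing different `cls_mask` and `reg_mask` values is what makes the two imitation terms use different samples; with identical masks the sketch reduces to the coupled setting that the abstract argues against.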