Abstract: Knowledge distillation is a common and effective model compression method that trains a compact student model to mimic a large teacher model and thereby generalize better. Previous knowledge distillation methods underperform on challenging tasks such as object detection, compared with their results on simpler classification tasks. In this paper, we argue that the failure of knowledge distillation on object detection is mainly caused by the imbalance between informative features and invalid background features. Not all background noise is redundant: after screening, the remaining background carries valuable relations between foreground and background. We therefore propose a novel regional filtering distillation (RFD) algorithm that addresses this problem with two modules: region selection and attention-guided distillation. Region selection first filters out the large volume of invalid background and retains knowledge-dense regions at anchor locations near objects. Attention-guided distillation then further improves distillation performance on object detection by extracting the relations between foreground and background in order to transfer key features. Extensive experiments on both one-stage and two-stage detectors demonstrate the effectiveness of RFD. For example, RFD improves mAP by 2.8% and 2.6% for the ResNet50-RetinaNet and ResNet50-FPN student networks on the MS COCO dataset, respectively. We also evaluate our method with the Faster R-CNN model on the Pascal VOC and KITTI benchmarks, obtaining mAP gains of 1.52% and 4.36% for the ResNet18-FPN student network, respectively. Furthermore, our method increases mAP by 5.70% for MobileNetv2-SSD compared to the original model. The proposed RFD technique thus performs well on detection tasks through regional filtering distillation. In the future, we plan to extend it to more challenging task scenarios, such as segmentation.
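The two modules named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption made for illustration. The function names (`region_mask`, `spatial_attention`, `filtered_distill_loss`) are hypothetical, expanding ground-truth boxes by a `margin` stands in for "anchor locations near objects", and the attention map (softmax over the channel-averaged absolute teacher activation) is a common choice in attention-based feature distillation rather than the definition used by RFD. Feature maps are plain nested lists indexed as `feat[channel][y][x]`.

```python
import math


def region_mask(height, width, boxes, margin=1):
    """Binary mask: 1.0 inside each box expanded by `margin` cells, else 0.0.

    Boxes are (x0, y0, x1, y1) in inclusive feature-map cell coordinates.
    Assumption: the expanded box approximates the retained "near-object" region.
    """
    mask = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0 - margin), min(height, y1 + margin + 1)):
            for x in range(max(0, x0 - margin), min(width, x1 + margin + 1)):
                mask[y][x] = 1.0
    return mask


def spatial_attention(feat, temperature=1.0):
    """Softmax over all locations of the channel-averaged |activation|."""
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    energy = [[sum(abs(feat[c][y][x]) for c in range(channels)) / channels
               for x in range(width)] for y in range(height)]
    peak = max(max(row) for row in energy)  # subtract the max for stability
    exps = [[math.exp((e - peak) / temperature) for e in row] for row in energy]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def filtered_distill_loss(student, teacher, boxes, margin=1, temperature=1.0):
    """Attention-weighted squared error, restricted to the selected regions."""
    channels, height, width = len(teacher), len(teacher[0]), len(teacher[0][0])
    mask = region_mask(height, width, boxes, margin)
    attn = spatial_attention(teacher, temperature)
    # Background outside the mask gets zero weight and is ignored entirely.
    weights = [[mask[y][x] * attn[y][x] for x in range(width)]
               for y in range(height)]
    norm = sum(sum(row) for row in weights)
    if norm == 0.0:
        return 0.0  # no region selected, so nothing to distill
    loss = sum(weights[y][x] * (student[c][y][x] - teacher[c][y][x]) ** 2
               for c in range(channels)
               for y in range(height)
               for x in range(width))
    return loss / (norm * channels)
```

Under these assumptions, a student that differs from the teacher only in filtered-out background incurs no loss, while a mismatch inside the retained region is penalized in proportion to the teacher's attention at that location.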

Regional filtering distillation for object detection

Research on Knowledge Distillation Algorithm of Object Detection

Focal and Global Knowledge Distillation for Detectors

Dual Relation Knowledge Distillation for Object Detection

Structured Knowledge Distillation for Accurate and Efficient Object Detection

Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors

Task-Balanced Distillation for Object Detection

'Parallel-Circuitized' distillation for dense object detection

Foreground separation knowledge distillation for object detection

DFD: Distilling the Feature Disparity Differently for Detectors

Instance-Conditional Knowledge Distillation for Object Detection

Distilling Object Detectors with Global Knowledge

Focal Distillation from High-Resolution Data to Low-Resolution Data for 3D Object Detection

Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-Guided Feature Imitation

Pixel Distillation: A New Knowledge Distillation Scheme for Low-Resolution Image Recognition

Efficient Object Detection in Optical Remote Sensing Imagery via Attention-Based Feature Distillation

Distilling Object Detectors With Fine-Grained Feature Imitation

Adaptive Knowledge Distillation for Lightweight Remote Sensing Object Detectors Optimizing

Distilling Object Detectors via Decoupled Features

Multilayer Semantic Features Adaptive Distillation for Object Detectors