DISTILLING DETR-LIKE DETECTORS WITH INSTANCE-AWARE FEATURE

Honglie Wang, Jian Xu, Shouqian Sun
DOI: https://doi.org/10.1109/icip46576.2022.9897586
2022-01-01
Abstract: DEtection TRansformer (DETR) has achieved great success in object detection but suffers from slow convergence during training. Knowledge distillation (KD) can speed up model training, but in object detection it faces the problem of locating knowledgeable regions, which related methods mainly locate empirically. To address these challenges, we propose a novel distillation framework for DETR-like transformer-based detectors. The key idea is to connect each instance with its corresponding response region in the feature map through cross attention. To better fuse the attention maps across different queries and heads, we introduce an attention fusion module that balances instances of different scales. Extensive experiments on DETR and Conditional DETR verify the proposed method. Our method improves the mAP by 3.19% for Conditional DETR with a ResNet-50 backbone trained for 50 epochs, outperforming the strong teacher trained for 108 epochs. We also boost DETR with a ResNet-50 backbone from 33.97% to 42.13% mAP (+8.16%) under 50 epochs.
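The abstract does not specify the attention fusion module or the distillation loss, so the following is only a minimal sketch of the general idea: decoder cross-attention maps are fused into a spatial mask, and the mask weights a feature-matching loss between student and teacher. The function names `fuse_attention` and `masked_feature_loss`, the head-averaging step, the per-query peak normalization, and the masked squared-error loss are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np


def fuse_attention(attn, eps=1e-8):
    """Fuse cross-attention maps into one spatial mask (assumed scheme).

    attn: array of shape (heads, queries, H, W) with non-negative weights.
    Returns an (H, W) mask with values in [0, 1].
    """
    # Assumption: average over attention heads -> (queries, H, W).
    per_query = attn.mean(axis=0)
    # Assumption: rescale each query map by its own peak, so that small
    # instances (low total attention mass) are not drowned out by large ones.
    peak = per_query.max(axis=(1, 2), keepdims=True)
    per_query = per_query / (peak + eps)
    # Assumption: combine queries with an element-wise maximum -> (H, W).
    return per_query.max(axis=0)


def masked_feature_loss(student, teacher, mask, eps=1e-8):
    """Mask-weighted mean squared error between (C, H, W) feature maps."""
    sq_err = (student - teacher) ** 2
    weighted = sq_err * mask[None, :, :]  # broadcast mask over channels
    return weighted.sum() / (mask.sum() * student.shape[0] + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    attn = rng.random((8, 100, 16, 16))       # hypothetical: 8 heads, 100 queries
    teacher = rng.standard_normal((256, 16, 16))
    student = rng.standard_normal((256, 16, 16))
    mask = fuse_attention(attn)
    print(mask.shape, masked_feature_loss(student, teacher, mask))
```

Under these assumptions, regions that no query attends to contribute little to the loss, which is one way to replace an empirically chosen region with one derived from the detector's own cross attention.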