LED-DETR: Lightweight, Efficient and Decoupled Object Detection with Transformers

Jianghao Wei, Wei Huang, Tao Hu, Qiang Liu, Jie Yu, Xiaoqiu Duan, Jiahuan Huang
DOI: https://doi.org/10.1145/3670105.3670171
2024-01-01
Abstract: The recently proposed DEtection TRansformer (DETR) removes hand-designed components and has become a new paradigm for object detection. However, its expensive inference cost limits its application. Edge devices could extend detection to more challenging scenarios, but they cannot accommodate the inference cost of DETR. To address this issue, we propose a Lightweight, Efficient, and Decoupled DETR, called LED-DETR. We divide DETR into two parts: the feature extractor and the object querier. In the feature extractor, we use lightweight convolutional neural networks as the backbone and efficient attention mechanisms for self-attention, making the model more lightweight. In the object querier, we incorporate the conditional spatial query to improve performance. Compared with the original DETR, LED-DETR achieves a good trade-off between compute, memory footprint, and performance.