Pruning DETR: efficient end-to-end object detection with sparse structured pruning

Huaiyuan Sun, Shuili Zhang, Xve Tian, Yuanyuan Zou
DOI: https://doi.org/10.1007/s11760-023-02719-4
2023-08-18
Abstract: Deep learning methods for object detection have made significant progress in performance, but end-to-end implementations still face challenges. Recently, the transformer-based DETR model introduced the attention mechanism into object detection and achieved end-to-end detection. Despite its competitive accuracy, however, DETR still falls short in inference speed and computational cost. To address this issue, this paper optimizes the DETR model with sparsity-induced structured pruning, aiming to improve its inference speed and reduce its computational cost. We adjust the importance of module outputs through parameter scaling factors and sparse regularization terms, and optimize the scaling factors using an improved Accelerated Proximal Gradient (APG) method. Experimental results on the COCO dataset show that our approach reduces computational cost by over 32% while maintaining an AP of 42.7%, and improves inference speed by over 17%, reaching 32.7 FPS. This study provides an effective solution for further improving the computational efficiency of transformer-based object detection models.
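To make the pruning idea in the abstract concrete, the following is a minimal sketch of sparsity-inducing scaling factors optimized with a standard accelerated proximal gradient loop. It is not the authors' improved APG variant, whose details the abstract does not give. The momentum schedule is the common FISTA-style one, the L1 penalty stands in for the unspecified sparse regularization term, and the quadratic toy loss, the names `soft_threshold`, `apg_l1`, `target`, and all hyperparameter values are assumptions chosen for illustration only.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: shrinks entries toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def apg_l1(grad_fn, gamma0, lr, lam, steps=200):
    """Accelerated proximal gradient for loss(gamma) + lam * ||gamma||_1.

    grad_fn returns the gradient of the smooth loss only; the L1 term is
    handled by the proximal (soft-thresholding) step.
    """
    gamma = gamma0.copy()   # current scaling factors
    y = gamma0.copy()       # extrapolated (momentum) point
    t = 1.0                 # FISTA momentum parameter
    for _ in range(steps):
        # Gradient step on the smooth loss, then proximal step on the L1 term.
        gamma_next = soft_threshold(y - lr * grad_fn(y), lr * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum extrapolation from the last two iterates.
        y = gamma_next + ((t - 1.0) / t_next) * (gamma_next - gamma)
        gamma, t = gamma_next, t_next
    return gamma

# Toy stand-in for the detection loss (an assumption, not from the paper):
# 0.5 * ||gamma - target||^2, where `target` plays the role of how useful
# each of four hypothetical modules is.
target = np.array([1.0, 0.05, 0.8, 0.02])
grad_fn = lambda g: g - target

gamma = apg_l1(grad_fn, np.ones(4), lr=0.5, lam=0.1)

# Scaling factors driven exactly to zero mark modules that could be removed.
keep_mask = gamma != 0.0
print(gamma, keep_mask)
```

The point of the proximal step is that it produces exact zeros rather than merely small values, so the decision of which structures to prune can be read directly from the scaling factors instead of requiring a separate magnitude threshold.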
Engineering, Electrical & Electronic; Imaging Science & Photographic Technology