SDPDet: Learning Scale-Separated Dynamic Proposals for End-to-End Drone-View Detection

Nengzhong Yin, Chengxu Liu, Ruhao Tian, Xueming Qian
DOI: https://doi.org/10.1109/tmm.2024.3371892
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Detecting objects in large-scale drone-view images is notoriously challenging because objects are unevenly distributed and vary widely in scale with the shooting angle. Common approaches improve drone-view object detection through two-step detection (i.e., detecting sub-regions first) and multi-scale input. However, these methods incur heavy computational costs because of their high model complexity and input resolution. In this paper, we propose a novel one-step detector, called SDPDet, to enable effective object learning in drone-view images. In particular, a Scale-separated Activation Pyramid (SAP) focuses on the regions where objects aggregate at each scale, and a Scale-separated Learnable Proposals (SLP) mechanism learns proposal boxes and their corresponding features on these regions. With this design, the number of learnable proposals can be adjusted dynamically at each scale separately, which facilitates learning objects of various distributions and scales at lower computational cost. Experiments demonstrate that SDPDet significantly outperforms state-of-the-art one-step detectors on three widely used benchmarks. On the most challenging VisDrone dataset, SDPDet with ResNet50 gains 5.4% AP and 6.9% AP_s while running 1.9x faster than previous models.