AODet: Aerial Object Detection Using Transformers for Foreground Regions

Xiaoming Wang, Hao Chen, Xiangxiang Chu, Peng Wang
DOI: https://doi.org/10.1109/tgrs.2024.3407815
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Aerial object detection is an important task and has received significant attention in recent years. Aerial images typically depict small and sparse instances against a simple background, and that simple background provides only limited information. Based on this observation, we present a new transformer-based framework for aerial object detection. In contrast to previous methods that address sparsity through multi-stage pipelines involving Region-of-Interest (RoI) techniques or Sparse Convolutions, our method, referred to as AODet, enjoys two significant advantages: 1) AODet is a simple yet accurate object detector specialized for aerial object detection. AODet identifies the background regions early and then operates only on the regions most likely to contain foreground objects, thereby significantly reducing redundant computation. The transformer exploits more context information among foreground regions, helping to retain high-quality detection results. 2) Instead of relying on sparse operations such as Sparse Convolutions, clustering algorithms, or RoI operations, AODet employs a transformer to detect objects from foreground proposals. Our approach is simpler and can be easily implemented with simple tensor manipulations. Extensive experiments have been conducted on VisDrone and DOTA. AODet achieves 40.9 AP on VisDrone and 79.6 mAP on DOTA, demonstrating its effectiveness.
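The idea of discarding background early and processing only likely-foreground regions can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how foreground regions are scored or how many are kept, so the per-location foreground score, the fixed `keep_ratio`, and the function names `select_foreground`, `gather`, and `scatter` are all assumptions made for illustration. Plain Python lists stand in for tensors.

```python
from typing import List

Token = List[float]


def select_foreground(scores: List[float], keep_ratio: float) -> List[int]:
    """Return indices of the highest-scoring locations, in original order.

    Assumption: each flattened feature-map location has one foreground score.
    """
    k = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def gather(tokens: List[Token], indices: List[int]) -> List[Token]:
    """Keep only the tokens at the selected locations."""
    return [tokens[i] for i in indices]


def scatter(updated: List[Token], indices: List[int],
            total: int, fill: Token) -> List[Token]:
    """Write processed tokens back to their locations; fill the rest."""
    out = [list(fill) for _ in range(total)]
    for i, tok in zip(indices, updated):
        out[i] = tok
    return out


# A 2x3 feature map flattened to 6 locations with 2-dimensional features.
tokens = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1],
          [3.0, 3.1], [4.0, 4.1], [5.0, 5.1]]
scores = [0.05, 0.90, 0.10, 0.80, 0.02, 0.20]

kept = select_foreground(scores, keep_ratio=0.5)
foreground = gather(tokens, kept)

# Pairwise self-attention cost grows with the square of the token count,
# so keeping half of the locations leaves a quarter of the token pairs.
pairs_full = len(tokens) ** 2
pairs_kept = len(foreground) ** 2

# A transformer would update `foreground` here; this sketch passes it through.
restored = scatter(foreground, kept, total=len(tokens), fill=[0.0, 0.0])
```

In a tensor library the same three steps would typically be a top-k selection, an index gather, and an index scatter, which is one plausible reading of "simple tensor manipulations" in the abstract.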