PR-Deformable DETR: DETR for Remote Sensing Object Detection
Yuepeng Chen, Bojun Liu, Luying Yuan
DOI: https://doi.org/10.1109/lgrs.2024.3483217
IF: 5.343
2024-11-01
IEEE Geoscience and Remote Sensing Letters
Abstract: Object detection in remote sensing images remains a critical challenge. Such images typically contain numerous small objects, large variations in object size, and a dispersed distribution of objects, all of which degrade the performance of existing object detectors. To address these challenges, we present PR-Deformable DEtection Transformer (DETR), a novel model for remote sensing object detection. First, we introduce the tridirectional adaptive feature fusion pyramid network (TAFFPN), a feature pyramid module that adaptively fuses information from different feature map layers, thereby enhancing the model's multiscale representation capability. Second, we propose the Res-Deformable Encoder, which integrates deformable encoders across different input scales via residual connections, generating feature vectors that capture rich semantic information about remote sensing objects. Last, we introduce the dynamic reference point module (DRPM) Decoder, which leverages 4-D reference points enriched with high-level (HL) feature priors to strengthen the model's object localization capability. Experimental results demonstrate that PR-Deformable DETR achieves state-of-the-art accuracy in remote sensing object detection, reaching 88.3% mean average precision (mAP) on the NWPU VHR-10 dataset and 95.1% mAP on the RSOD dataset, with a corresponding 16% reduction in GFLOPs. These results satisfy the performance standards required for remote sensing object detection tasks.
imaging science & photographic technology; remote sensing; engineering, electrical & electronic; geochemistry & geophysics
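
The abstract says that TAFFPN adaptively fuses information from different feature map layers but does not say how. As a rough illustration of what adaptive fusion can mean in general, the sketch below combines several feature maps of the same size using per-location softmax weights. It is not the authors' TAFFPN: the function names, the plain-list data layout, and the assumption that the maps have already been resized to a common resolution are all hypothetical, and the `logits` stand in for scores that a trained network would predict.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(features, logits):
    """Fuse K same-sized 2-D feature maps with per-location weights.

    features: list of K maps, each an H x W nested list of floats
              (assumed to be already resized to a common resolution).
    logits:   list of K maps of the same shape; hypothetical stand-ins
              for scores that a learned layer would produce.
    Returns an H x W map in which every location is a convex
    combination of the K input values at that location.
    """
    k = len(features)
    height = len(features[0])
    width = len(features[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = softmax([logits[i][y][x] for i in range(k)])
            fused[y][x] = sum(
                weights[i] * features[i][y][x] for i in range(k)
            )
    return fused


# Three toy 1 x 2 "feature maps", e.g. from three pyramid levels.
feats = [[[1.0, 1.0]], [[2.0, 2.0]], [[3.0, 3.0]]]
# Equal scores at x=0; the third map strongly preferred at x=1.
scores = [[[0.0, 0.0]], [[0.0, 0.0]], [[0.0, 50.0]]]
result = adaptive_fuse(feats, scores)
```

With equal scores, the fused value is the plain average of the inputs. When one score dominates, the fused value approaches that map's value, which is the sense in which the weighting is adaptive.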