A Pyramid Attention Network with Edge Information Injection for Remote-Sensing Object Detection

Junjie Zhang, Anqi Ding, Guanyi Li, Liangang Zhang, Dan Zeng
DOI: https://doi.org/10.1109/lgrs.2023.3294395
IF: 5.343
2023-01-01
IEEE Geoscience and Remote Sensing Letters
Abstract: Remote-sensing images (RSIs) are often characterized by high spatial resolution, strong object scale effects, and complex scenes, which pose great challenges to object detection. Although mainstream neural network-based methods work well in detecting common objects, they often fail to fully exploit the detailed structural information in the spatial domain, leading to poor performance on objects with diverse scales and distributions against complicated backgrounds. To address this issue, we propose a pyramid attention network with edge information injection for remote-sensing object detection (RSOD). Each object is composed of an inner body and an outer profile, which correspond to the low- and high-frequency (LF and HF) components of the image, respectively; the HF component can therefore be obtained as the difference between the original image and its LF component. We design an edge information extraction module (EIEM) to mine detailed edge features at multiple scales and inject them into the features at the corresponding scales in the backbone network. To improve performance in complex scenes, we introduce a pyramid feature fusion (PFF) module, which leverages both local and global attention to establish long-range channel dependencies, thereby highlighting the objects that deserve attention. To verify the effectiveness of the proposed method, we conduct extensive experiments on the detection in optical remote sensing images (DIOR) and RSOD datasets, where the mean average precision (mAP) reaches 74.93% and 96.44%, respectively, demonstrating that our model achieves state-of-the-art (SOTA) performance compared with mainstream methods.
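The HF-as-difference idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' EIEM: the abstract does not say how the LF component is computed, so a simple mean (box) filter is assumed here, and the names `box_blur`, `high_frequency`, `downsample2`, and `edge_pyramid`, the kernel size `k`, and the 2x average-pooling pyramid are all hypothetical choices made for illustration.

```python
import numpy as np

def box_blur(img, k=5):
    """Assumed LF extractor: k x k mean filter with edge-replicated padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    # Sum the k*k shifted copies of the image, then normalize.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def high_frequency(img, k=5):
    """HF component = original image minus its LF (blurred) component."""
    img = img.astype(np.float64)
    return img - box_blur(img, k)

def downsample2(img):
    """Halve the resolution by 2 x 2 average pooling (assumed pyramid step)."""
    h, w = img.shape
    h2, w2 = h - h % 2, w - w % 2
    return img[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2).mean(axis=(1, 3))

def edge_pyramid(img, levels=3, k=5):
    """Return one HF (edge) map per scale, from fine to coarse."""
    maps = []
    cur = img.astype(np.float64)
    for _ in range(levels):
        maps.append(high_frequency(cur, k))
        cur = downsample2(cur)
    return maps

# Toy grayscale image: a bright square (the "inner body") on a dark background.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
pyr = edge_pyramid(img)
```

On this toy image the HF maps vanish inside the square and in the background, and are non-zero only near the square's outline, which is the kind of edge signal the abstract describes injecting into the backbone features at the matching scale. How that injection is done (addition, concatenation, or a learned fusion) is not stated in the abstract and is left out of the sketch.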