LR-FPN: Enhancing Remote Sensing Object Detection with Location Refined Feature Pyramid Network

Hanqian Li, Ruinan Zhang, Ye Pan, Junchi Ren, Fei Shen
2024-04-02
Abstract: Remote sensing target detection aims to identify and locate critical targets within remote sensing images, finding extensive applications in agriculture and urban planning. Feature pyramid networks (FPNs) are commonly used to extract multi-scale features. However, existing FPNs often overlook extracting low-level positional information and fine-grained context interaction. To address this, we propose a novel location refined feature pyramid network (LR-FPN) to enhance the extraction of shallow positional information and facilitate fine-grained context interaction. The LR-FPN consists of two primary modules: the shallow position information extraction module (SPIEM) and the contextual interaction module (CIM). Specifically, SPIEM first maximizes the retention of solid location information of the target by simultaneously extracting positional and saliency information from the low-level feature map. Subsequently, CIM injects this robust location information into different layers of the original FPN through spatial and channel interaction, explicitly enhancing the object area. Moreover, in spatial interaction, we introduce a simple local and non-local interaction strategy to learn and retain the saliency information of the object. Lastly, the LR-FPN can be readily integrated into common object detection frameworks to improve performance significantly. Extensive experiments on two large-scale remote sensing datasets (i.e., DOTA V1.0 and HRSC2016) demonstrate that the proposed LR-FPN is superior to state-of-the-art object detection approaches. Our code and models will be publicly available.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses two shortcomings of existing feature pyramid networks (FPNs) in remote sensing object detection:

1. **Insufficient extraction and utilization of low-level positional information**: Traditional FPN architectures often overlook the positional information available in the shallow layers of the network (i.e., low-level feature maps). This information is crucial for localization accuracy, especially in remote sensing scenes where objects are densely packed.
2. **Inadequate context interaction**: In existing FPN architectures, interaction between feature maps is usually limited to simple channel fusion and makes little use of spatial information, so context is exchanged only coarsely.

To address these issues, the paper proposes the Location Refined Feature Pyramid Network (LR-FPN), which consists of two main modules:

- **Shallow Position Information Extraction Module (SPIEM)**: Maximizes the retention of precise target location information by extracting positional and saliency information from the low-level feature map simultaneously. This helps compensate for positional information across the different pyramid levels and preserves localization accuracy.
- **Contextual Interaction Module (CIM)**: Injects this location information into the different levels of the original FPN through interaction in both the spatial and channel dimensions, enhancing the representation of target areas. In the spatial interaction, it uses a simple local and non-local interaction strategy to learn and retain the saliency information of the target.
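To make the data flow of the two modules concrete, here is a minimal, parameter-free sketch in plain Python. It is not the authors' implementation: the function names (`saliency_map`, `avg_pool`, `inject_location`), the `mix` parameter, and the use of fixed pooling and sigmoid gates in place of learned layers are all assumptions made for illustration. It only shows the general pattern of deriving a location cue from a shallow feature map and using it to reweight a deeper pyramid level along the spatial and channel dimensions.

```python
import math

# Illustrative toy only: feature maps are nested lists fmap[c][y][x]
# (channels, height, width). All names here are hypothetical.

def saliency_map(low):
    """Collapse channels into one H x W map of mean |activation|, scaled to [0, 1]."""
    c, h, w = len(low), len(low[0]), len(low[0][0])
    sal = [[sum(abs(low[k][y][x]) for k in range(c)) / c for x in range(w)]
           for y in range(h)]
    peak = max(max(row) for row in sal) or 1.0  # avoid dividing by zero
    return [[v / peak for v in row] for row in sal]

def avg_pool(grid, factor):
    """Downsample an H x W grid by averaging non-overlapping factor x factor blocks."""
    h, w = len(grid) // factor, len(grid[0]) // factor
    return [[sum(grid[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) / factor ** 2
             for x in range(w)] for y in range(h)]

def inject_location(level, sal, mix=0.5):
    """Reweight one pyramid level with a shallow saliency map.

    Assumes the saliency map's size is an integer multiple of the level's size.
    Spatial gate: blend of per-pixel saliency (local) and its global mean (non-local).
    Channel gate: sigmoid of each channel's global average (a stand-in for learned weights).
    """
    h, w = len(level[0]), len(level[0][0])
    factor = len(sal) // h
    local = avg_pool(sal, factor) if factor > 1 else sal
    global_mean = sum(map(sum, local)) / (h * w)
    spatial = [[mix * v + (1 - mix) * global_mean for v in row] for row in local]
    out = []
    for chan in level:
        pooled = sum(map(sum, chan)) / (h * w)
        weight = 1 / (1 + math.exp(-pooled))
        out.append([[chan[y][x] * weight * (1 + spatial[y][x]) for x in range(w)]
                    for y in range(h)])
    return out

# Shallow map (2 x 4 x 4) with an "object" in the top-left 2 x 2 block.
low = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
for k in range(2):
    for y in range(2):
        for x in range(2):
            low[k][y][x] = 4.0

level = [[[1.0, 1.0], [1.0, 1.0]]]  # deeper pyramid level (1 x 2 x 2)
refined = inject_location(level, saliency_map(low))
```

In this toy input, the cell of the deeper level that covers the object region receives a larger gain than the other cells, which is the qualitative effect the summary describes as "enhancing the representation of target areas". A real model would replace the fixed gates with learned convolution and attention layers.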
According to the paper, LR-FPN improves detection performance on two large-scale remote sensing datasets (DOTA V1.0 and HRSC2016) and can be readily integrated into common object detection frameworks. The reported experiments support its effectiveness for remote sensing object detection.