Abstract: Remote sensing target detection aims to identify and locate critical targets within remote sensing images, finding extensive applications in agriculture and urban planning. Feature pyramid networks (FPNs) are commonly used to extract multi-scale features. However, existing FPNs often overlook extracting low-level positional information and fine-grained context interaction. To address this, we propose a novel location refined feature pyramid network (LR-FPN) to enhance the extraction of shallow positional information and facilitate fine-grained context interaction. The LR-FPN consists of two primary modules: the shallow position information extraction module (SPIEM) and the contextual interaction module (CIM). Specifically, SPIEM first maximizes the retention of solid location information of the target by simultaneously extracting positional and saliency information from the low-level feature map. Subsequently, CIM injects this robust location information into different layers of the original FPN through spatial and channel interaction, explicitly enhancing the object area. Moreover, in spatial interaction, we introduce a simple local and non-local interaction strategy to learn and retain the saliency information of the object. Lastly, the LR-FPN can be readily integrated into common object detection frameworks to improve performance significantly. Extensive experiments on two large-scale remote sensing datasets (i.e., DOTA V1.0 and HRSC2016) demonstrate that the proposed LR-FPN is superior to state-of-the-art object detection approaches. Our code and models will be publicly available.

What problem does this paper attempt to address?

The paper addresses two shortcomings of existing Feature Pyramid Networks (FPNs) in remote sensing object detection:

1. **Insufficient extraction and utilization of low-level positional information**: Traditional FPN architectures often overlook the valuable positional information in the shallow layers of the network (i.e., low-level feature maps). This information is crucial for localization accuracy, especially in remote sensing scenarios where objects are densely packed.
2. **Inadequate context information interaction**: In existing FPN architectures, information exchange between feature maps is usually limited to simple channel fusion and makes little use of spatial information, so context interaction remains coarse.

To address these issues, the paper proposes a new network structure, the Location Refined Feature Pyramid Network (LR-FPN), which includes two main modules:

- **Shallow Position Information Extraction Module (SPIEM)**: Aims to maximize the retention of precise positional information of the target by extracting positional and saliency information from low-level feature maps simultaneously. This helps compensate for missing positional information across different levels and maintains the accuracy of target localization.
- **Contextual Interaction Module (CIM)**: Injects this positional information into different levels of the original FPN through interaction in both the spatial and channel dimensions, enhancing the representation of target areas. In the spatial interaction, it also introduces a simple local and non-local interaction strategy to learn and retain the saliency information of the target.

According to the paper, LR-FPN improves detection performance on two large-scale remote sensing datasets (DOTA V1.0 and HRSC2016) and can be easily integrated into common object detection frameworks. The experimental results support the effectiveness of LR-FPN and its potential for remote sensing object detection.
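
The two-module idea described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: `shallow_location_map` and `inject_location` are hypothetical stand-ins for SPIEM and CIM, the average-plus-max pooling, sigmoid gating, and block-average resizing are assumptions chosen for simplicity, and the local/non-local strategy is omitted. It only shows the general pattern of deriving a location map from a shallow feature map and using it to modulate each pyramid level spatially and per channel.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def avg_pool_to(x, out_hw):
    """Downsample a (C, H, W) map to (C, h, w) by block averaging.

    Assumes H and W are integer multiples of h and w.
    """
    c, h, w = x.shape
    oh, ow = out_hw
    return x.reshape(c, oh, h // oh, ow, w // ow).mean(axis=(2, 4))


def shallow_location_map(low_feat):
    """Hypothetical SPIEM stand-in: a (1, H, W) map in (0, 1).

    Combines channel-wise average (position cue) and channel-wise
    maximum (saliency cue) of the low-level feature map.
    """
    avg = low_feat.mean(axis=0, keepdims=True)
    mx = low_feat.max(axis=0, keepdims=True)
    return sigmoid(avg + mx)


def inject_location(level_feat, loc_map):
    """Hypothetical CIM stand-in: modulate one pyramid level.

    Spatial interaction: weight every position by the resized location map.
    Channel interaction: gate channels by their location-weighted mean.
    A residual connection keeps the original features.
    """
    _, h, w = level_feat.shape
    loc = avg_pool_to(loc_map, (h, w))            # (1, h, w)
    spatial = level_feat * loc                    # (C, h, w)
    chan = sigmoid(spatial.sum(axis=(1, 2)) / (loc.sum() + 1e-6))  # (C,)
    return level_feat + spatial * chan[:, None, None]


# Toy usage with random features (shapes are illustrative assumptions).
rng = np.random.default_rng(0)
low = rng.standard_normal((8, 64, 64))            # shallow feature map
pyramid = [rng.standard_normal((16, 64 // s, 64 // s)) for s in (2, 4, 8)]

loc = shallow_location_map(low)
refined = [inject_location(p, loc) for p in pyramid]
```

Each refined level keeps the shape of its input, which is what would let a module of this kind be placed on top of an existing FPN without changing the detection head.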

LR-FPN: Enhancing Remote Sensing Object Detection with Location Refined Feature Pyramid Network
