Abstract: Benefiting from the development of pyramidal feature learning, the current state-of-the-art multi-scale detection paradigm has become proficient in detecting objects of varying scales. However, the feature pyramid network (FPN), despite constructing multi-scale features with strong semantics, still suffers from limited performance caused by insufficient detail exploitation, information loss, limited receptive fields, and hard proposal assignment. These problems fall mainly into two categories: semantic level and instance level. To address these limitations, this paper analyzes the structural components that inhibit multi-scale feature representation and then presents a multi-stage progressive FPN (ProFPN) along with a novel RoI feature representation method called soft proposal assignment. At the semantic level, a bottom-up interaction module is first proposed to address the insufficient exploitation of high-resolution features. In this module, global context attention blocks let adjacent-level features interact with detail information in a bottom-up progressive manner. A top-down transfer module is then designed to mitigate the loss of semantic information in high-level features. In this module, multi-branch asymmetric dilated blocks are applied in a top-down progressive manner, expanding receptive fields to capture more object poses. At the instance level, to overcome the hard assignment of object proposals, a nonparametric strategy named soft proposal assignment is proposed, which uses the scale of each object proposal to generate dynamic weights for RoI features from adjacent levels. Comprehensive experiments conducted on the MS COCO dataset demonstrate the superiority of ProFPN. While adding negligible extra FLOPs, the proposed ProFPN outperforms most pyramid-based methods. Moreover, owing to the inherited feature utilization design in ProFPN, transformer-based detectors substantially improve at detecting small objects while simultaneously achieving significant reductions in FLOPs. The source code of the proposed method is available at https://github.com/GingerCohle/ProFPN .
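
To make the contrast between hard and soft proposal assignment concrete, the following is a minimal illustrative sketch, not the ProFPN implementation. It assumes the standard FPN rule for hard assignment, k = floor(k0 + log2(sqrt(w*h) / 224)), and, as a hypothetical stand-in for the paper's weighting, derives nonparametric weights for the two adjacent levels by linear interpolation on the continuous level index. The function names, the level range P2 to P5, and the interpolation scheme are all assumptions made for illustration.

```python
import math

# Assumed constants following the original FPN convention (levels P2..P5).
K0, CANONICAL, K_MIN, K_MAX = 4, 224.0, 2, 5


def continuous_level(w, h):
    """Real-valued pyramid level for a proposal of size w x h, clamped to range."""
    k = K0 + math.log2(math.sqrt(w * h) / CANONICAL)
    return min(max(k, K_MIN), K_MAX)


def hard_assignment(w, h):
    """Standard FPN rule: each proposal is pooled from exactly one level."""
    return math.floor(continuous_level(w, h))


def soft_assignment(w, h):
    """Hypothetical soft rule: scale-dependent weights for two adjacent levels.

    The weights are obtained by linear interpolation on the continuous level,
    so they need no learned parameters. This is an illustrative assumption,
    not necessarily the weighting used in ProFPN.
    """
    k = continuous_level(w, h)
    lower = min(math.floor(k), K_MAX - 1)  # keep lower + 1 inside the pyramid
    frac = k - lower
    return {lower: 1.0 - frac, lower + 1: frac}


def blend_roi_features(features_by_level, weights):
    """Weighted sum of same-shaped RoI feature vectors pooled per level."""
    length = len(next(iter(features_by_level.values())))
    blended = [0.0] * length
    for level, weight in weights.items():
        for i, value in enumerate(features_by_level[level]):
            blended[i] += weight * value
    return blended


if __name__ == "__main__":
    weights = soft_assignment(160, 160)
    print(hard_assignment(160, 160), weights)
    print(blend_roi_features({3: [1.0, 0.0], 4: [0.0, 1.0]}, weights))
```

Under the hard rule, a 160x160 proposal is pooled from a single level, whereas the soft rule in this sketch splits its RoI feature between two adjacent levels in proportion to how close the proposal's scale is to each.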

Bidirectional Matrix Feature Pyramid Network for Object Detection

Bidirectional Parallel Feature Pyramid Network for Object Detection

SAFPN: a Full Semantic Feature Pyramid Network for Object Detection

Single-Shot Bidirectional Pyramid Networks for High-Quality Object Detection

MM-FPN: Multi-path and Multi-scale Feature Pyramid Network for Object Detection

Dynamic Feature Pyramid Networks for Detection

Complementary Feature Pyramid Network for Object Detection

Dual Attention Based Image Pyramid Network for Object Detection

MGFPN: Enhancing Multi-Scale Feature for Object Detection

MFPN: A Novel Mixture Feature Pyramid Network of Multiple Architectures for Object Detection

Parallel Residual Bi-Fusion Feature Pyramid Network for Accurate Single-Shot Object Detection

Multi-Scale Residual Aggregation Feature Pyramid Network for Object Detection

Construct Effective Geometry Aware Feature Pyramid Network for Multi-Scale Object Detection

You Should Look at All Objects

AFPN: Asymptotic Feature Pyramid Network for Object Detection

ProFPN: Progressive feature pyramid network with soft proposal assignment for object detection

RCNet: Reverse Feature Pyramid and Cross-scale Shift Network for Object Detection

Cascaded Multi-Channel Feature Fusion for Object Detection

Multi-scale redistribution feature pyramid for object detection

Enhanced semantic feature pyramid network for small object detection

Centralized Feature Pyramid for Object Detection