Abstract: Feature pyramids are widely used to improve scale invariance in object detection. Most methods map objects to feature maps with square receptive fields of a relevant size but rarely account for variation in aspect ratio, which is also an important property of object instances. This leads to a poor match between rectangular objects and their assigned features, which hinders accurate recognition and localization. In addition, information propagation among feature layers is sparse: each feature in the pyramid may contain mainly or only single-level information, which is not representative enough for the classification and localization sub-tasks. In this paper, we propose the Bidirectional Matrix Feature Pyramid Network (BMFPN) to address these issues. It consists of three modules: the Diagonal Layer Generation Module (DLGM), the Top-down Module (TDM), and the Bottom-up Module (BUM). First, multi-level features extracted by the backbone are fed into DLGM to produce the base features. These base features are then passed through TDM and BUM in series to construct the final feature pyramid. The receptive fields of the resulting feature layers cover various scales and aspect ratios, so each object can be assigned to an appropriate and representative feature map according to its scale and aspect ratio. Moreover, TDM and BUM form a bidirectional, reticular information flow that fuses multi-level information in top-down and bottom-up manners, respectively. To evaluate the proposed architecture, we design and train an end-to-end anchor-free detector by integrating BMFPN into FCOS. We also replace the center-ness branch in FCOS with our Gaussian center-ness branch (GCB), which brings a further slight improvement.
Without bells and whistles, our method gains +3.3%, +2.4%, and +2.6% AP on the MS COCO dataset over the baselines with ResNet-50, ResNet-101, and ResNeXt-101 backbones, respectively.
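
To make the center-ness modification more concrete, the sketch below contrasts the standard FCOS center-ness score with one possible Gaussian-shaped alternative. The FCOS formula is the published one. The Gaussian variant, the function name `gaussian_centerness`, and the `sigma` parameter are assumptions for illustration only; the abstract does not give the actual definition of GCB, so this should not be read as the authors' formulation.

```python
import math


def fcos_centerness(l, t, r, b):
    """Standard FCOS center-ness for a location inside a box.

    l, t, r, b are the positive distances from the location to the
    left, top, right, and bottom sides of the ground-truth box.
    """
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


def gaussian_centerness(l, t, r, b, sigma=0.5):
    """Hypothetical Gaussian-shaped center-ness (NOT the paper's GCB formula).

    The offset of the location from the box center is normalized by the
    half-width and half-height, so dx and dy lie in [-1, 1]. sigma is an
    assumed hyperparameter controlling how quickly the score decays.
    """
    dx = (l - r) / (l + r)  # normalized horizontal offset from the center
    dy = (t - b) / (t + b)  # normalized vertical offset from the center
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))


if __name__ == "__main__":
    # A location at the exact box center scores 1.0 under both definitions.
    print(fcos_centerness(2, 2, 2, 2), gaussian_centerness(2, 2, 2, 2))
    # An off-center location scores lower under both definitions.
    print(fcos_centerness(1, 2, 3, 2), gaussian_centerness(1, 2, 3, 2))
```

Both scores equal 1 at the box center and decrease toward the border. Under this assumed form, the Gaussian score decays smoothly and never reaches exactly zero at the box edge, whereas the FCOS score approaches zero as the location nears any side.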

Multi-scale Object Detection by Top-Down and Bottom-Up Feature Pyramid Network

Multi-scale Convolution Target Detection Algorithm with Feature Pyramid

MM-FPN: Multi-path and Multi-scale Feature Pyramid Network for Object Detection

EBiDA-FPN: Enhanced Bi-Directional Attention Feature Pyramid Network for Object Detection

MLA-Net: Feature Pyramid Network with Multi-Level Local Attention for Object Detection

Multi-Scale Residual Aggregation Feature Pyramid Network for Object Detection

A New Feature Pyramid Network for Object Detection

Bidirectional Matrix Feature Pyramid Network for Object Detection

Multi-level Feature Fusion Pyramid Network for Object Detection

Multi-scale Global Context Feature Pyramid Network for Object Detector

Multi-scale Vertical Cross-layer Feature Aggregation and Attention Fusion Network for Object Detection

Pyramid attention object detection network with multi-scale feature fusion

Dual Attention Based Image Pyramid Network for Object Detection

Learning Discriminated Features Based on Feature Pyramid Networks and Attention for Multi-scale Object Detection

AFPN: Asymptotic Feature Pyramid Network for Object Detection

Scale-Insensitive Object Detection Via Attention Feature Pyramid Transformer Network

Feature Combination Based On Receptive Fields And Cross-Fusion Feature Pyramid For Object Detection

Feature Enhancement for Multi-scale Object Detection

Stacked Pyramid Attention Network for Object Detection

Scale Adaptive Feature Pyramid Networks for 2D Object Detection

Annular Feature Pyramid Network for Salient Object Detection