Aggregated Residual Dilation-Based Feature Pyramid Network for Object Detection

Xiaotong Zhao, Wei Li, Yifan Zhang, Shuo Chang, Zhiyong Feng, Ping Zhang
DOI: https://doi.org/10.1109/access.2019.2941892
IF: 3.9
2019-01-01
IEEE Access
Abstract: To tackle multi-scale object detection, recent detectors usually adopt hierarchical feature pyramids generated by naive combinations of top-down features and lateral features. Because the methods used to augment top-down features have limited effective receptive fields, each generated region is associated only with a fixed area of the coarser features. Meanwhile, since the finer features are tied to rigid coarser regions, irrelevant regions inevitably introduce noisy features. Thus, pyramidal features with strong semantics are difficult to obtain by simply enlarging the top-down features. In this paper, we present the Aggregated Residual Dilation-based Feature Pyramid Network (ARDFPN) to exploit the inherent correlation of regions in the feature pyramid. The network is designed by stacking a building block that aggregates a set of dilated convolutions with the same topology. We show that carefully adding transformation stages to the feature pyramid offers a potential way to generate further multi-scale features. As ARDFPN is an intuitive extension of the Feature Pyramid Network (FPN), we conduct an exhaustive study that evaluates model performance when FPN is replaced with the proposed ARDFPN in both object detection and instance segmentation tasks. With a residual network in the Faster R-CNN and Mask R-CNN frameworks, ARDFPN outperforms the prevalent detection module, FPN, on the challenging COCO dataset without bells and whistles. In particular, ARDFPN exhibits superior performance, especially for small and medium-sized objects.
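The building block described in the abstract (parallel dilated convolutions of identical shape, aggregated and wrapped in a residual connection) can be illustrated with a minimal 1D sketch in plain Python. This is not the authors' implementation: the abstract does not state the dilation rates, kernel sizes, or aggregation operator, so the function names, the use of summation to aggregate branches, and the 1D setting are all assumptions made for illustration.

```python
def dilated_conv1d(x, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation).

    Assumes an odd kernel length so the kernel can be centred on each position.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):            # positions outside the signal count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def aggregated_residual_dilation_block(x, kernels, dilations):
    """Hypothetical block: same-shape branches, different dilations, summed, plus identity.

    Summation as the aggregation step is an assumption, not taken from the paper.
    """
    branches = [dilated_conv1d(x, k, d) for k, d in zip(kernels, dilations)]
    aggregated = [sum(values) for values in zip(*branches)]
    return [xi + ai for xi, ai in zip(x, aggregated)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernels = [[1, 1, 1], [1, 1, 1]]  # identical branch topology
    print(aggregated_residual_dilation_block(impulse, kernels, dilations=[1, 2]))
    print([effective_kernel_size(3, d) for d in (1, 2, 3)])
```

Each branch uses the same three-tap kernel, but a larger dilation spreads those taps over a wider span, so the aggregated output at one position draws on a larger neighbourhood without adding parameters. This is the property the abstract appeals to when it refers to limited effective receptive fields.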