AFMF: Adaptive Fusion of Multi-Scale Features for Pixel-Level Object Detection

Ying Yang, Hong Liang, Qianyin Chen, Qian Zhang, Chunlei Wu, Shanchen Pang
DOI: https://doi.org/10.2139/ssrn.3997537
2022-01-01
SSRN Electronic Journal
Abstract: Small targets have low resolution and carry little feature information, so the features extracted from them are weakly expressive, which greatly hinders improvements in object detection accuracy. In this paper, a pixel-level prediction and regression method based on Fully Convolutional Networks (FCN) is adopted to build a one-stage anchor-free object detection model. Two modules are proposed for this model: an Adaptive Spatial Pyramid Pooling (ASPP) module and an Adaptive Spatial Fusion Pyramid Network (ASFPN) module. The ASPP module, attached to the backbone network, obtains more fine-grained features by enlarging the receptive field of the original features, and adaptively aggregates features from different receptive fields to enrich the context information of local areas. The ASFPN module adaptively fuses multi-scale features to build a feature pyramid with rich multi-scale context information and to establish connections among pixels. In addition, a residual connection is added to obtain spatial context information with ratio-invariance and to reduce the loss of location information from the original features. In single-model, single-scale testing, the proposed detector, AFMF, uses ResNeXt-64x4d-101 to achieve 44.3% AP on the MS COCO dataset, surpassing previous FCN-based anchor-free one-stage detectors while maintaining real-time detection.
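The abstract does not give the fusion formula, so the following is only a minimal sketch of one common way to fuse features adaptively: each pixel receives a softmax-normalized weight per input map, and the fused value is the weighted sum. The function names `softmax` and `adaptive_fuse`, the use of plain score maps in place of learned layers, and the assumption that all maps were already resized to one resolution are illustrative choices, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(feature_maps, score_maps):
    """Fuse K same-sized 2-D maps using per-pixel softmax weights.

    feature_maps, score_maps: lists of K grids (lists of rows of floats).
    In a trained network the scores would come from learned layers;
    here they are plain inputs (an assumption made for illustration).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # One weight per input map at this pixel; weights sum to 1.
            weights = softmax([s[y][x] for s in score_maps])
            fused[y][x] = sum(w * f[y][x] for w, f in zip(weights, feature_maps))
    return fused


# Two 1x2 feature maps, assumed already resized to a common resolution.
coarse = [[1.0, 1.0]]
fine = [[3.0, 3.0]]
# Equal scores at the first pixel; the second pixel strongly favors `fine`.
scores_coarse = [[0.0, 0.0]]
scores_fine = [[0.0, 10.0]]
fused = adaptive_fuse([coarse, fine], [scores_coarse, scores_fine])
```

With equal scores the first pixel is a plain average of the two inputs, while at the second pixel the fused value is dominated by the `fine` map. This per-pixel weighting is what lets a fusion layer prefer different scales in different image regions.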