Multi-level Feature Fusion Pyramid Network for Object Detection

Zebin Guo, Hui Shuai, Guangcan Liu, Yisheng Zhu, Wenqing Wang
DOI: https://doi.org/10.1007/s00371-022-02589-w
IF: 2.835
2022-01-01
The Visual Computer
Abstract: Scale variation is one of the challenges in object detection. In this paper, we design a Multi-Level Feature Fusion Pyramid Network (MLFFPN) that fuses features with different receptive fields to produce reliable object representations that are robust to scale variation. Specifically, we extract features from the backbone network with convolutional kernels of different sizes and reconstruct feature pyramids with the various receptive fields by adding top-down paths and lateral connections. The reconstructed feature pyramids are then fused, and a bottom-up path enhancement is added for the final prediction. To verify the proposed method, we constructed a large-scale object detection dataset containing 16,000 images and 225,944 instances across 30 classes of common objects. We integrate MLFFPN into an object detection network and conduct a series of experiments on our dataset and on the MS COCO dataset. Without bells and whistles, MLFFPN achieves a considerable detection improvement over the baseline network.
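The data flow described in the abstract can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions, not the authors' implementation: feature maps are single-channel 2D lists, a k x k mean filter stands in for a learned k x k convolution, the kernel sizes (1, 3, 5) and the element-wise sum used for fusion are guesses (the abstract does not specify them), and all function names are hypothetical.

```python
def box_filter(fm, k):
    """k x k mean filter with zero padding (assumed stand-in for a learned k x k conv)."""
    h, w, r = len(fm), len(fm[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += fm[ii][jj]
            out[i][j] = total / (k * k)
    return out


def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D map."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fm):
    """Stride-2 subsampling (assumed stand-in for a stride-2 convolution)."""
    return [row[::2] for row in fm[::2]]


def add(a, b):
    """Element-wise sum of two maps with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(levels):
    """Top-down path with lateral connections; levels are ordered fine -> coarse."""
    out = [None] * len(levels)
    out[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        out[i] = add(levels[i], upsample2x(out[i + 1]))
    return out


def bottom_up(levels):
    """Bottom-up path enhancement; levels are ordered fine -> coarse."""
    out = [levels[0]]
    for level in levels[1:]:
        out.append(add(level, downsample2x(out[-1])))
    return out


def mlffpn_sketch(backbone, kernel_sizes=(1, 3, 5)):
    """One pyramid per kernel size, fused by summation, then enhanced bottom-up."""
    pyramids = [top_down([box_filter(c, k) for c in backbone]) for k in kernel_sizes]
    fused = pyramids[0]
    for pyramid in pyramids[1:]:
        fused = [add(a, b) for a, b in zip(fused, pyramid)]
    return bottom_up(fused)


# Toy backbone outputs at three scales: 8x8, 4x4 and 2x2.
backbone = [[[1.0] * n for _ in range(n)] for n in (8, 4, 2)]
outputs = mlffpn_sketch(backbone)
print([(len(m), len(m[0])) for m in outputs])
```

The sketch keeps one output map per backbone level, so a detection head could be attached to each scale. A real implementation would use multi-channel tensors and learned convolutions in place of the mean filter.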