A feature pyramid network with adaptive fusion strategy and enhanced semantic information

Longfei Qin, Wenchao Pang, Dexin Zhao
DOI: https://doi.org/10.1007/s00530-024-01378-w
IF: 3.9
2024-06-09
Multimedia Systems
Abstract: To better detect objects of different scales, detectors need inputs at different resolutions and from different receptive fields. Currently, advanced detectors usually incorporate a feature pyramid structure to fuse multi-scale object features. Top-down and bottom-up network structures are the basic strategy for multi-scale feature extraction. Although the feature pyramid network (FPN) can alleviate the conflict between resolution and receptive field to a certain extent, existing FPN-based models tend to ignore conflicting information between different layers during fusion, and fuzzy boundary information is prone to being lost during top-down propagation. This paper first introduces the detector, then analyzes the defects of the feature pyramid network, and finally proposes a feature pyramid network (SG-FPN) with an adaptive fusion strategy and enhanced semantic information to address these problems. The validity of the model is verified on mainstream datasets, and its performance is superior to that of other state-of-the-art methods.
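The top-down fusion that the abstract refers to can be illustrated with a minimal NumPy sketch. This is a generic weighted-fusion pyramid and not the SG-FPN design, whose details the abstract does not give. The function names (`upsample2x`, `adaptive_fuse`, `top_down_pathway`) and the per-level scalar logits are assumptions made for illustration; in a trained detector such weights would typically be learned.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def adaptive_fuse(lateral, top_down, logits):
    """Blend two same-shaped maps with softmax-normalised weights.

    `logits` is a hypothetical pair of scalars standing in for learned
    fusion parameters; equal logits give a plain average.
    """
    logits = np.asarray(logits, dtype=float)
    w = np.exp(logits - logits.max())
    w = w / w.sum()
    return w[0] * lateral + w[1] * top_down

def top_down_pathway(features, logits_per_level):
    """Fuse a pyramid from coarse to fine.

    features: list of (C, H, W) maps ordered fine to coarse, each level
              half the spatial size of the previous one.
    logits_per_level: one logit pair per fused level (len(features) - 1).
    """
    outputs = [features[-1]]  # the coarsest level passes through unchanged
    for lateral, logits in zip(reversed(features[:-1]),
                               reversed(logits_per_level)):
        up = upsample2x(outputs[0])  # carry coarse semantics downward
        outputs.insert(0, adaptive_fuse(lateral, up, logits))
    return outputs

# Toy pyramid with constant-valued maps at three scales.
pyramid = [np.full((4, 16, 16), 1.0),
           np.full((4, 8, 8), 2.0),
           np.full((4, 4, 4), 3.0)]
fused = top_down_pathway(pyramid, [(0.0, 0.0), (0.0, 0.0)])
```

With equal logits each level is the average of its own features and the upsampled coarser level; unequal logits let one source dominate, which is the basic idea behind weighting layers differently instead of summing them.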
computer science, information systems, theory & methods