Dual SIE-FPN: Semantic and Spatial Information Enhancement for Multiscale Object Detection

Mingjie Liu, Junhu Chen, Ping Liu, Junsheng Chen, Kyunghi Chang, Changhao Piao, Minglu Li
DOI: https://doi.org/10.1109/tii.2024.3441649
IF: 12.3
2024-01-01
IEEE Transactions on Industrial Informatics
Abstract: Feature pyramid networks (FPNs) can substantially improve object detection performance by extracting multiscale features. However, current FPN-based methods lose the intrinsic correlation of local information in each feature map, which hinders the effective transmission of semantic information. In addition, the 1 × 1 convolution in the lateral connections of FPN may cause spatial information loss. In this article, we propose a novel semantic and spatial information enhancing feature pyramid network (Dual SIE-FPN), which focuses on alleviating transmission loss across multiscale hierarchical features and enhancing the feature representation. Specifically, Dual SIE-FPN contains three modules: Lateral Feature Enhancement (LFE), Global Attention Upsampling (GAU), and Multiple Information Compensation (MIC). LFE is designed to capture deep semantic representations and enhance channel information. GAU compensates for the spatial information loss caused by upsampling while transmitting high-level features, together with the compensatory information, to low-level features. MIC works in parallel with LFE to further reduce the information loss resulting from the 1 × 1 convolution. Experimental results on the MS COCO and UAVDT datasets demonstrate that Dual SIE-FPN achieves competitive performance compared with other state-of-the-art FPNs. In addition, Dual SIE-FPN can be embedded into any computer vision task based on multiscale feature extraction to improve performance.
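For orientation, the baseline that the abstract refers to (1 × 1 lateral convolutions followed by upsampling and addition in a top-down pathway) can be sketched as below. This is a minimal, dependency-free illustration of a plain FPN, not of Dual SIE-FPN: the abstract does not specify the internals of LFE, GAU, or MIC, so none of them is implemented here. The function names, the nested-list tensor layout, and the use of nearest-neighbor 2x upsampling are assumptions made for the example, and the 3 × 3 smoothing convolution that FPN usually applies after merging is omitted.

```python
# Sketch of a plain FPN top-down pathway (assumed baseline, not Dual SIE-FPN).
# Feature maps are nested lists shaped [channels][height][width].

def conv1x1(feat, weight):
    """1 x 1 convolution: a per-pixel linear map over channels.

    weight is shaped [out_channels][in_channels]; no bias term.
    """
    in_ch = len(feat)
    h, w = len(feat[0]), len(feat[0][0])
    return [
        [[sum(weight[o][c] * feat[c][y][x] for c in range(in_ch))
          for x in range(w)]
         for y in range(h)]
        for o in range(len(weight))
    ]


def upsample2x(feat):
    """Nearest-neighbor 2x upsampling along height and width."""
    out = []
    for channel in feat:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out


def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [
        [[va + vb for va, vb in zip(row_a, row_b)]
         for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]


def fpn_top_down(backbone_feats, lateral_weights):
    """Fuse backbone features ordered from high to low resolution.

    Each level is half the spatial size of the previous one. Returns the
    pyramid levels in the same order as the inputs.
    """
    # Lateral connections: project every level to a common channel count.
    laterals = [conv1x1(f, w) for f, w in zip(backbone_feats, lateral_weights)]
    # Top-down pathway: start at the coarsest level and merge downwards.
    outputs = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        outputs.insert(0, add_maps(lateral, upsample2x(outputs[0])))
    return outputs
```

In this sketch the two steps the abstract identifies as lossy are `conv1x1`, which compresses the channels at each pixel into a smaller set, and `upsample2x`, which only copies coarse values and cannot restore fine spatial detail. According to the abstract, these are the points that MIC and LFE (lateral connection) and GAU (upsampling) are meant to compensate for.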