Cross-scale information enhancement for object detection

Tie-jun Li,Hui-feng Zhao,Li, Tie-jun
DOI: https://doi.org/10.1007/s11042-024-18737-4
IF: 2.577
2024-03-05
Multimedia Tools and Applications
Abstract:Object detection usually adopts multi-scale fusion to enrich the information of the object, and the Feature Pyramid Network (FPN) is a common method for multi-scale fusion. However, traditional fusion methods such as FPN cause information loss when fusing high-level feature maps with low-level feature maps. To solve these problems, we propose a simple but effective cross-scale fusion method that fully uses the information of multi-scale feature maps. In addition, to better utilize the multi-scale contextual information, we designed the Selective Information Enhancement (SIE) module. The SIE dynamically selects information at more important scales for objects of different size and fuse the selected information with feature maps for information enhancement. Apply our method to Single Shot Multibox Detector (SSD) and propose a Cross-Scale Information Enhancement Single Shot Multibox Detector (CESSD). The CESSD improves the object detection capability of SSD models by fusing multi-scale features and selectively enhancing feature map information. To evaluate the effectiveness of the model, we validated it on the Pascal VOC2007 test set for 300 × 300 inputs, and the mean Average Precision (mAP) of CESSD reached 79.8%.
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?