Dual-branch network object detection algorithm based on dual-modality fusion of visible and infrared images

Li, Xinyue; Yang, Chen; Yu, Wangsheng
DOI: https://doi.org/10.1007/s00530-024-01540-4
IF: 3.9
2024-11-06
Multimedia Systems
Abstract: To address the limitations of visible images in object detection, this paper proposes a dual-branch object detection algorithm based on the fusion of visible and infrared images. Built on YOLOv7-s, the algorithm first introduces a spatial attention module to strengthen the model's ability to capture key features. Second, to handle inconsistent object sizes, it proposes a visible multi-scale feature fusion module and adopts the structure of the SimCSPSPPF module (an improved spatial pyramid pooling module) from YOLOv6 to construct an infrared multi-scale feature fusion module that efficiently extracts multi-scale features from infrared images. Finally, a cross-modal feature fusion module fuses features of corresponding scales from the visible and infrared images. The algorithm is evaluated on the KAIST, FLIR, and GIR datasets. On KAIST, compared with YOLOv7-s detecting on visible and infrared images separately, detection accuracy improves by 18.0% and 5.1%, respectively, at a detection speed of 51.8 FPS; on FLIR and GIR, the algorithm also shows significant advantages. Furthermore, the algorithm can detect objects in individual visible or infrared images while maintaining high detection accuracy.
computer science, information systems, theory & methods
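The abstract names the building blocks but not their internals, so the following is only a minimal sketch of the general idea: gate each branch with a spatial attention map, then fuse visible and infrared features scale by scale. Everything here is an assumption for illustration, not the paper's design: the function names (`spatial_attention`, `fuse_scale`, `fuse_pyramid`), the fixed 0.5/0.5 mix of channel mean and max (a CBAM-style module would learn a convolution instead), and the softmax weighting over global average activations. Feature maps are plain nested lists of shape C x H x W so the sketch needs nothing beyond the standard library.

```python
import math


def spatial_attention(feat):
    """Gate a C x H x W feature map with a per-location attention value.

    Assumption: the score is a fixed 0.5/0.5 mix of the channel-wise mean
    and max; a learned convolution would normally replace this.
    """
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            vals = [feat[k][i][j] for k in range(c)]
            score = 0.5 * (sum(vals) / c) + 0.5 * max(vals)
            gate = 1.0 / (1.0 + math.exp(-score))  # sigmoid, in (0, 1)
            for k in range(c):
                out[k][i][j] = feat[k][i][j] * gate
    return out


def _global_mean(feat):
    """Average over every channel and location of one feature map."""
    vals = [v for channel in feat for row in channel for v in row]
    return sum(vals) / len(vals)


def fuse_scale(vis, ir):
    """Fuse two same-shaped maps with softmax weights (hypothetical rule).

    The modality with the larger global average activation gets the
    larger weight; the two weights always sum to 1.
    """
    gv, gi = _global_mean(vis), _global_mean(ir)
    m = max(gv, gi)  # subtract the max for numerical stability
    ev, ei = math.exp(gv - m), math.exp(gi - m)
    wv = ev / (ev + ei)
    wi = 1.0 - wv
    return [
        [[wv * a + wi * b for a, b in zip(row_v, row_i)]
         for row_v, row_i in zip(ch_v, ch_i)]
        for ch_v, ch_i in zip(vis, ir)
    ]


def fuse_pyramid(vis_pyramid, ir_pyramid):
    """Apply attention to each branch, then fuse corresponding scales."""
    return [
        fuse_scale(spatial_attention(v), spatial_attention(r))
        for v, r in zip(vis_pyramid, ir_pyramid)
    ]
```

Called with two lists of feature maps, one per scale and with matching shapes across modalities, `fuse_pyramid` returns one fused map per scale, which a detection head could then consume. Because the fusion weights are computed per scale, the balance between the two modalities can differ from one scale to the next.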