MFMANet: a multispectral pedestrian detection network using multi-resolution RGB feature reuse with multi-scale FIR attentions
Jiaren Guo, Yuzhen Zhang, Jianyin Zheng, Zihao Huang, Yanyun Tao
DOI: https://doi.org/10.1007/s00138-024-01564-w
IF: 2.983
2024-06-16
Machine Vision and Applications
Abstract: In multispectral pedestrian detection, especially under challenging low-illumination conditions, existing methods based on cross-modality feature interaction generalize poorly and struggle to balance multimodal interaction across different data distributions. To address these issues, we propose MFMANet, a more efficient network for multispectral pedestrian detection that combines multi-resolution feature reuse with multi-scale attention. An enhanced UNet reuses multi-resolution features, supplying pixel-level information to the decoder to improve object detection. Contour and temperature attention mechanisms are designed to focus on the shape and crucial areas of pedestrians, overcoming the loss of detail commonly associated with the RGB modality. Extensive experiments on the KAIST and CVC-14 datasets validate MFMANet, which achieves a mean Average Precision (mAP) of 96.7% and 95.8% and a Miss Rate (MR) of 6.65% and 18.7%, respectively. These findings underscore the precision and computational efficiency of MFMANet and position it as a significant improvement over traditional methods in multispectral pedestrian detection.
computer science, cybernetics, artificial intelligence, engineering, electrical & electronic