A Feature Aggregation Network for Multispectral Pedestrian Detection

Yan Gong, Lu Wang, Lisheng Xu
DOI: https://doi.org/10.1007/s10489-023-04628-y
IF: 5.3
2023-01-01
Applied Intelligence
Abstract: Pedestrian detection is an important task in many computer vision applications. Because multispectral pedestrian detection can alleviate the difficulties caused by insufficient illumination at night, it has developed rapidly in recent years. However, how to fuse color and thermal images effectively still requires further research. In this paper, we propose a Feature Aggregation Module (FAM) that adaptively captures the cross-channel and cross-dimension information interaction between the two modalities. In addition, we develop a Feature Aggregation Network (FANet) that embeds the proposed FAM into a two-stream network adapted from YOLOv5. FANet is small (15 MB) and fast (8 ms per frame). Extensive experiments on the KAIST dataset show that the proposed method is effective for multispectral pedestrian detection, especially at night, where the Miss Rate is only 8.91%. Moreover, we show that a saliency map computed from the thermal image can be incorporated into FANet to further improve detection accuracy. To verify the generalization ability of the FAM, we also conduct experiments on two person re-identification datasets, Market1501 and Duke. The FAM compares favorably with existing feature fusion mechanisms on both datasets.