Spatial and Channel-wise Attention in Multimodality Feature Fusion for Multispectral Pedestrian Detection

Hao Wei, Jifeng Shen, Xin Zuo, Wankou Yang
DOI: https://doi.org/10.1109/itia50152.2020.9312309
2020-01-01
Abstract: Multispectral pedestrian detection that fuses features from thermal and visible images has achieved great success in all-day scenarios, but little attention has been paid to the quality of the feature fusion. In this paper, we propose an improved attention-aware dual-stream Faster R-CNN algorithm to improve feature quality. Quantitative analysis of three different feature fusion methods demonstrates that detector performance is highly related to the quality of feature fusion. Therefore, an improved multi-step spatial and channel-wise attention method is proposed to improve the feature fusion quality step by step, based on the dual-stream Faster R-CNN framework. With the aid of the attention mechanism, model learning is optimized more smoothly, which effectively enhances the response of the fused feature map at each stage. Experiments on the KAIST and CVC-14 datasets demonstrate that the proposed method achieves better performance than the baseline method, with decreases of 4% and 8% in AMR, respectively.