CSFuser: A Cascade Siamese Fusion Architecture for RGB-Infrared Object Detection

Ziyi Li, Gang Zhang, Zhigang Zeng, Xiaolin Hu
DOI: https://doi.org/10.1007/978-981-97-4399-5_17
2024-01-01
Abstract: RGB-Infrared multi-modal object detection harnesses diverse and complementary information from RGB and infrared images, offering significant advantages in intelligent transportation. The primary challenge lies in fusing RGB and infrared images effectively. Two issues currently hinder fusion: first, misalignment between the RGB and infrared images complicates the fusion process; second, the substantial differences between the features of the two modalities impede the learning of complementary features. Existing fusion architectures often overlook these challenges or prioritize the RGB data, thereby neglecting the full potential of infrared data. To address these challenges, we introduce the Multi-scale Attention-based Complementary Fusion (MACF) module, a straightforward yet effective feature fusion module embedded within a cascade siamese architecture. A lightweight alignment module is designed to align the two modalities before feature extraction. Through progressive fusion of RGB and infrared features, CSFuser addresses these challenges directly. Extensive experiments on the DENSE dataset under adverse conditions such as heavy snow, rain, and fog demonstrate CSFuser’s superiority over the leading method, HRFuser, while being 2x faster. This performance underscores our method’s ability to fuse RGB and infrared images effectively. The code will be made publicly available.
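The abstract does not specify how MACF is computed, so the following is only a toy sketch of the general idea it names: attention-weighted fusion applied at several scales, with neither modality treated as primary. Everything here is an assumption made for illustration, including the function names (`softmax`, `fuse_scale`, `fuse_multiscale`), the use of activation magnitude as the attention score, and the representation of each scale as a flat list of channel values. It is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scale(rgb_feat, ir_feat):
    """Fuse one scale of features channel by channel.

    Assumption (illustrative only): the attention score of each modality
    is the magnitude of its activation, and the two scores are normalised
    with a softmax so the weights sum to 1. The rule is symmetric, so
    neither RGB nor infrared is favoured by construction.
    """
    fused, weights = [], []
    for r, i in zip(rgb_feat, ir_feat):
        w_rgb, w_ir = softmax([abs(r), abs(i)])
        fused.append(w_rgb * r + w_ir * i)
        weights.append((w_rgb, w_ir))
    return fused, weights


def fuse_multiscale(rgb_pyramid, ir_pyramid):
    """Apply the same fusion rule independently at every scale."""
    return [fuse_scale(r, i)[0] for r, i in zip(rgb_pyramid, ir_pyramid)]


if __name__ == "__main__":
    # Hypothetical two-scale feature pyramids for each modality.
    rgb = [[1.0, 0.0], [0.5, 0.5, 0.5]]
    ir = [[1.0, 2.0], [0.0, 0.5, 3.0]]
    for scale, feats in enumerate(fuse_multiscale(rgb, ir)):
        print(scale, [round(v, 3) for v in feats])
```

Where one modality carries a much stronger response in a channel, the fused value leans toward it; where the two agree, the result is their common value. A learned attention module would replace the magnitude heuristic used here.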