RITFusion: Reinforced Interactive Transformer Network for Infrared and Visible Image Fusion

Xiaoling Li, Yanfeng Li, Houjin Chen, Yahui Peng, Luyifu Chen, Minjun Wang
DOI: https://doi.org/10.1109/tim.2023.3342223
IF: 5.6
2024-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: Infrared (IR) and visible (VIS) image fusion aims to produce a single comprehensive image that is consistent with the source images in both details and thermal targets. However, existing image fusion methods perform poorly on low-quality images captured in adverse conditions, such as overexposure, low illumination, smoke occlusion, and similar backgrounds. To address this problem, we propose an effective reinforced interactive transformer network for IR and VIS image fusion (RITFusion). To integrate the different features of the two modalities, we develop a novel fusion strategy composed of an intramodality self-attention (IMSA) block, a modal reinforcement (MR) block, and an intermodality interactive-attention (IMIA) block. Together, these blocks associate and exchange information between modalities while strengthening features weakened by the adverse conditions. Moreover, a multiscale skip connection network is deployed to make full use of features from the source images. Furthermore, we design a morphology-based loss function and combine it with an intensity loss function to guide network training toward retaining adequate details and ample thermal targets. Extensive experiments on public datasets demonstrate that RITFusion outperforms other state-of-the-art fusion methods under both common and adverse conditions. An extended experiment on salient object detection shows that the proposed method can boost detection performance in subsequent high-level vision tasks.
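The abstract names a morphology-based loss combined with an intensity loss but does not give their formulas. The following is only a rough illustrative sketch of how such a combination might look, not the authors' definition. Everything in it is an assumption: the morphological gradient (3x3 dilation minus 3x3 erosion) as the "morphology" term, the element-wise maximum of IR and VIS as the target, the L1 distance, the weight `alpha`, and all function names. Images are plain nested lists of floats so the example has no dependencies; a real training setup would use a tensor library.

```python
def morph_gradient(img):
    """Morphological gradient: 3x3 dilation minus 3x3 erosion (edges clamped)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neigh = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = max(neigh) - min(neigh)
    return out


def elementwise_max(a, b):
    """Pixel-wise maximum of two equally sized images."""
    return [[max(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def mean_abs_diff(a, b):
    """Mean absolute (L1) difference between two equally sized images."""
    total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def fusion_loss(fused, ir, vis, alpha=1.0):
    """Hypothetical combined loss: intensity term + alpha * morphology term."""
    # Intensity term (assumed): fused image should match the brighter source pixel.
    l_int = mean_abs_diff(fused, elementwise_max(ir, vis))
    # Morphology term (assumed): fused edges should match the stronger source edge.
    target_grad = elementwise_max(morph_gradient(ir), morph_gradient(vis))
    l_morph = mean_abs_diff(morph_gradient(fused), target_grad)
    return l_int + alpha * l_morph
```

For example, with a uniformly dark `ir` and a uniformly brighter `vis`, passing `vis` itself as `fused` gives a loss of zero under these assumptions, because both the intensity and the edge targets are met exactly.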