A Multi-scale Information Integration Framework for Infrared and Visible Image Fusion

Guang Yang, Jie Li, Hanxiao Lei, Xinbo Gao
2023-12-07
Abstract: Infrared and visible image fusion aims to generate a fused image that contains the intensity and detail information of the source images. The key issue is effectively measuring and integrating the complementary information of multi-modality images of the same scene. Existing methods mostly adopt a simple weight in the loss function to decide how much information of each modality is retained, rather than adaptively measuring complementary information for different image pairs. In this study, we propose a multi-scale dual attention (MDA) framework for infrared and visible image fusion, designed to measure and integrate complementary information in both the network structure and the loss function, at the image and patch levels. In our method, a residual downsample block first decomposes the source images into three scales. Then, a dual attention fusion block integrates complementary information and generates spatial and channel attention maps at each scale for feature fusion. Finally, the output image is reconstructed by a residual reconstruction block. The loss function consists of three parts (image-level, feature-level and patch-level); the image-level and patch-level parts are calculated using weights generated by the complementary information measurement. In addition, a style loss is added to constrain the pixel intensity distribution between the output and the infrared image. Our fusion results are robust and informative across different scenarios. Qualitative and quantitative results on two datasets show that our method preserves both thermal radiation and detail information from the two modalities and achieves results comparable to other state-of-the-art methods. Ablation experiments show the effectiveness of our information integration architecture and of adaptively measuring complementary information retention in the loss function.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily addresses the issue of fusing infrared images with visible light images. Specifically, the research aims to effectively measure and integrate the complementary information between different modality images (infrared images and visible light images) when generating fused images.

#### Core Issues

1. **Effective Measurement of Complementary Information**: Existing methods mostly use simple weighting methods to determine the retention level of each modality's information, rather than adaptively measuring the complementary information between different image pairs.
2. **Information Integration in Structure and Loss Functions**: A multi-scale dual attention (MDA) framework is proposed to measure and integrate complementary information at both image and patch levels through structure and loss functions.

#### Main Contributions

1. **Multi-Scale Dual Attention Framework**: A multi-scale dual attention framework is designed to extract features of different modality images at multiple spatial scales, utilizing pixel intensity and texture detail information.
2. **Dual Attention Fusion Block**: An information fusion block based on spatial and channel attention mechanisms is designed to determine the importance of significant spatial regions and channels.
3. **Measurement of Complementary Information**: The complementary information between infrared and visible light images is effectively measured through statistical methods, generating adaptive weight coefficients for each term in the loss function to constrain the differences between the fused result and the input images.
4. **Experimental Validation**: Extensive experiments were conducted on the TNO and RoadScene datasets, demonstrating the competitive performance of the proposed method in both qualitative and quantitative comparisons.

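
The idea behind the dual attention fusion block (contribution 2) can be illustrated with a small, parameter-free sketch. This is not the paper's block, which is learned and whose layers are not described here. It assumes channel weights come from global average pooling, spatial weights come from a per-pixel channel mean, and the two are averaged. The function names (`channel_attention`, `spatial_attention`, `dual_attention_fuse`) and the softmax normalisation are hypothetical choices made for illustration.

```python
import math


def softmax2(a, b):
    """Numerically stable softmax over two scalars; returns weights summing to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def channel_attention(feat_ir, feat_vis):
    """One (w_ir, w_vis) pair per channel, from global average pooling (assumed)."""
    weights = []
    for c_ir, c_vis in zip(feat_ir, feat_vis):
        n = len(c_ir) * len(c_ir[0])
        g_ir = sum(map(sum, c_ir)) / n
        g_vis = sum(map(sum, c_vis)) / n
        weights.append(softmax2(g_ir, g_vis))
    return weights


def spatial_attention(feat_ir, feat_vis):
    """One (w_ir, w_vis) pair per pixel, from the mean over channels (assumed)."""
    c, h, w = len(feat_ir), len(feat_ir[0]), len(feat_ir[0][0])
    return [
        [
            softmax2(
                sum(feat_ir[k][y][x] for k in range(c)) / c,
                sum(feat_vis[k][y][x] for k in range(c)) / c,
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def dual_attention_fuse(feat_ir, feat_vis):
    """Fuse two C x H x W feature stacks (nested lists) at a single scale."""
    ch = channel_attention(feat_ir, feat_vis)
    sp = spatial_attention(feat_ir, feat_vis)
    fused = []
    for k, (a_ir, a_vis) in enumerate(ch):
        plane = []
        for y, sp_row in enumerate(sp):
            row = []
            for x, (s_ir, s_vis) in enumerate(sp_row):
                # Average channel and spatial weights; the pair still sums to 1,
                # so each fused value is a convex combination of the inputs.
                w_ir = 0.5 * (a_ir + s_ir)
                w_vis = 0.5 * (a_vis + s_vis)
                row.append(w_ir * feat_ir[k][y][x] + w_vis * feat_vis[k][y][x])
            plane.append(row)
        fused.append(plane)
    return fused
```

In a multi-scale setting, a function like `dual_attention_fuse` would be applied once per scale before reconstruction. Because the weights for the two modalities sum to 1 at every position, identical inputs are passed through unchanged.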
Through the above methods, the paper aims to generate high-quality fused images that contain thermal radiation and detailed information, suitable for various application scenarios such as recognition, surveillance, and target detection.
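
The adaptive loss weighting mentioned in the contributions can be sketched in the same spirit. The summary only says that the weights come from "statistical methods", so the statistic used here is an assumption. This sketch uses mean absolute gradient as a stand-in information measure and a softmax to turn the two scores into weights. `mean_gradient`, `adaptive_weights`, `patch_weights`, `weighted_intensity_loss`, and the `temperature` parameter are hypothetical names, not the paper's formulation.

```python
import math


def mean_gradient(img):
    """Average absolute finite difference (horizontal and vertical) of a 2D image."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                total += abs(img[y][x + 1] - img[y][x])
                count += 1
            if y + 1 < h:
                total += abs(img[y + 1][x] - img[y][x])
                count += 1
    return total / count if count else 0.0


def adaptive_weights(ir, vis, temperature=1.0):
    """Image-level weights (w_ir, w_vis), summing to 1, from the assumed statistic."""
    s_ir, s_vis = mean_gradient(ir), mean_gradient(vis)
    m = max(s_ir, s_vis)
    e_ir = math.exp((s_ir - m) / temperature)
    e_vis = math.exp((s_vis - m) / temperature)
    z = e_ir + e_vis
    return e_ir / z, e_vis / z


def patch_weights(ir, vis, size):
    """Patch-level weights: the same measure applied to non-overlapping patches."""
    h, w = len(ir), len(ir[0])
    grid = []
    for y0 in range(0, h, size):
        row = []
        for x0 in range(0, w, size):
            p_ir = [r[x0:x0 + size] for r in ir[y0:y0 + size]]
            p_vis = [r[x0:x0 + size] for r in vis[y0:y0 + size]]
            row.append(adaptive_weights(p_ir, p_vis))
        grid.append(row)
    return grid


def weighted_intensity_loss(fused, ir, vis):
    """Image-level squared-error loss, weighted per modality by adaptive_weights."""
    w_ir, w_vis = adaptive_weights(ir, vis)
    h, w = len(fused), len(fused[0])
    loss = 0.0
    for y in range(h):
        for x in range(w):
            f = fused[y][x]
            loss += w_ir * (f - ir[y][x]) ** 2 + w_vis * (f - vis[y][x]) ** 2
    return loss / (h * w)
```

With this stand-in measure, a flat infrared image paired with a textured visible image gives the visible term the larger weight, so the weighting changes per image pair instead of being fixed in advance.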