Abstract: Infrared and visible image fusion aims to generate a fused image that contains the intensity and detail information of the source images. The key issue is to effectively measure and integrate the complementary information of multi-modality images of the same scene. Existing methods mostly adopt a simple weight in the loss function to decide how much information from each modality is retained, rather than adaptively measuring complementary information for different image pairs. In this study, we propose a multi-scale dual attention (MDA) framework for infrared and visible image fusion, which is designed to measure and integrate complementary information in both the structure and the loss function at the image and patch levels. In our method, the residual downsample block first decomposes the source images into three scales. Then, the dual attention fusion block integrates complementary information and generates spatial and channel attention maps at each scale for feature fusion. Finally, the output image is reconstructed by the residual reconstruction block. The loss function consists of three parts: image-level, feature-level, and patch-level. The image-level and patch-level parts are calculated using the weights generated by the complementary information measurement. In addition, a style loss is added to constrain the pixel intensity distribution between the output and the infrared image. Our fusion results are robust and informative across different scenarios. Qualitative and quantitative results on two datasets show that our method preserves both thermal radiation and detail information from the two modalities and achieves results comparable with other state-of-the-art methods. Ablation experiments show the effectiveness of our information integration architecture and of adaptively measuring complementary information retention in the loss function.
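
To make the dual attention fusion step in the abstract more concrete, here is a minimal pure-Python sketch of fusing one scale of infrared and visible features with channel and spatial attention. It is not the authors' implementation. The abstract does not say how the attention maps are computed, so the pooling statistics (global average per channel, channel-wise mean per pixel), the two-way softmax, the equal averaging of the two attention weights, and all function names are assumptions made for illustration only.

```python
import math

def softmax_pair(a, b):
    """Softmax over two scalars; the two returned weights sum to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

def channel_weights(feat_ir, feat_vis):
    """Per-channel (ir, vis) weights from global average pooling (assumed statistic)."""
    weights = []
    for ch_ir, ch_vis in zip(feat_ir, feat_vis):
        n = len(ch_ir) * len(ch_ir[0])
        mean_ir = sum(map(sum, ch_ir)) / n
        mean_vis = sum(map(sum, ch_vis)) / n
        weights.append(softmax_pair(mean_ir, mean_vis))
    return weights

def spatial_weights(feat_ir, feat_vis):
    """Per-pixel (ir, vis) weights from the channel-wise mean (assumed statistic)."""
    c, h, w = len(feat_ir), len(feat_ir[0]), len(feat_ir[0][0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            s_ir = sum(feat_ir[k][i][j] for k in range(c)) / c
            s_vis = sum(feat_vis[k][i][j] for k in range(c)) / c
            row.append(softmax_pair(s_ir, s_vis))
        out.append(row)
    return out

def dual_attention_fuse(feat_ir, feat_vis):
    """Fuse two [C][H][W] feature maps with averaged channel and spatial attention."""
    cw = channel_weights(feat_ir, feat_vis)
    sw = spatial_weights(feat_ir, feat_vis)
    fused = []
    for k, (ch_ir, ch_vis) in enumerate(zip(feat_ir, feat_vis)):
        plane = []
        for i, (r_ir, r_vis) in enumerate(zip(ch_ir, ch_vis)):
            row = []
            for j, (x_ir, x_vis) in enumerate(zip(r_ir, r_vis)):
                # Equal averaging of the two attention types is an assumption.
                w_ir = 0.5 * (cw[k][0] + sw[i][j][0])
                w_vis = 0.5 * (cw[k][1] + sw[i][j][1])
                row.append(w_ir * x_ir + w_vis * x_vis)
            plane.append(row)
        fused.append(plane)
    return fused
```

Because the infrared and visible weights sum to 1 at every position, each fused value is a convex combination of the two inputs. In a multi-scale setting such as the one the abstract describes, a function like this would be applied once per scale before reconstruction.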

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper primarily addresses the issue of fusing infrared images with visible light images. Specifically, the research aims to effectively measure and integrate the complementary information between different modality images (infrared images and visible light images) when generating fused images.

#### Core Issues

1. **Effective Measurement of Complementary Information**: Existing methods mostly use simple weighting methods to determine the retention level of each modality's information, rather than adaptively measuring the complementary information between different image pairs.
2. **Information Integration in Structure and Loss Functions**: A multi-scale dual attention (MDA) framework is proposed to measure and integrate complementary information at both image and patch levels through structure and loss functions.

#### Main Contributions

1. **Multi-Scale Dual Attention Framework**: A multi-scale dual attention framework is designed to extract features of different modality images at multiple spatial scales, utilizing pixel intensity and texture detail information.
2. **Dual Attention Fusion Block**: An information fusion block based on spatial and channel attention mechanisms is designed to determine the importance of significant spatial regions and channels.
3. **Measurement of Complementary Information**: The complementary information between infrared and visible light images is effectively measured through statistical methods, generating adaptive weight coefficients for each term in the loss function to constrain the differences between the fused result and the input images.
4. **Experimental Validation**: Extensive experiments were conducted on the TNO and RoadScene datasets, demonstrating the competitive performance of the proposed method in both qualitative and quantitative comparisons.

Through the above methods, the paper aims to generate high-quality fused images that contain thermal radiation and detailed information, suitable for various application scenarios such as recognition, surveillance, and target detection.
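
The idea of adaptive weight coefficients can be sketched as follows. The summary only says the weights come from "statistical methods", so this example substitutes a simple stand-in: the mean absolute gradient of each source image is used as the information measure, and a softmax turns the two measures into weights for an image-level loss. The choice of statistic, the `temperature` parameter, the use of mean squared error, and all function names are hypothetical and are not taken from the paper.

```python
import math

def mean_abs_gradient(img):
    """Average absolute forward difference (horizontal and vertical) of a 2D image."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                total += abs(img[i][j + 1] - img[i][j])
                count += 1
            if i + 1 < h:
                total += abs(img[i + 1][j] - img[i][j])
                count += 1
    return total / count if count else 0.0

def adaptive_weights(ir, vis, temperature=1.0):
    """Softmax over an assumed information measure; returns (w_ir, w_vis)."""
    g_ir, g_vis = mean_abs_gradient(ir), mean_abs_gradient(vis)
    m = max(g_ir, g_vis)
    e_ir = math.exp((g_ir - m) / temperature)
    e_vis = math.exp((g_vis - m) / temperature)
    z = e_ir + e_vis
    return e_ir / z, e_vis / z

def mse(a, b):
    """Mean squared error between two 2D images of the same size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def weighted_image_loss(fused, ir, vis):
    """Image-level loss whose two terms are weighted per image pair."""
    w_ir, w_vis = adaptive_weights(ir, vis)
    return w_ir * mse(fused, ir) + w_vis * mse(fused, vis)
```

With a flat infrared image and a textured visible image, the visible term receives the larger weight, so a fused result that stays close to the visible image is penalized less. A patch-level variant could call the same functions on corresponding patches instead of whole images.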

A Multi-scale Information Integration Framework for Infrared and Visible Image Fusion

Fusion of infrared and visual images through multiscale hybrid unidirectional total variation

Integrating Parallel Attention Mechanisms and Multi-Scale Features for Infrared and Visible Image Fusion

Multi-scale infrared and visible image fusion framework based on dual partial differential equations

SADFusion: A multi-scale infrared and visible image fusion method based on salient-aware and domain-specific

MEEAFusion: Multi-Scale Edge Enhancement and Joint Attention Mechanism Based Infrared and Visible Image Fusion

MDDCMA: A Distributed Image Fusion Framework Based on Multiscale Dense Dilated Convolution and Coordinate Mean Attention

A perceptual framework for infrared-visible image fusion based on multiscale structure decomposition and biological vision

Visible and Infrared Image Fusion Based on Attention and Multiscale Residuals

Multi-scale Convolutional Neural Networks and Saliency Weight Maps for Infrared and Visible Image Fusion

MAFusion: Multiscale Attention Network for Infrared and Visible Image Fusion

Multi-scale attention-based lightweight network with dilated convolutions for infrared and visible image fusion

Infrared–Visible Image Fusion through Feature-Based Decomposition and Domain Normalization

CHFusion: A Cross-modality High-resolution Representation Framework for Infrared and Visible Image Fusion

Infrared and visible image fusion based on infrared background suppression

Adaptive low light visual enhancement and high-significant target detection for infrared and visible image fusion

Infrared and visible image fusion based on edge-preserving filter and weighted least square optimization

Infrared-visible image fusion method based on multi-scale shearing Co-occurrence filter

Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism

Infrared and Visible Image Fusion Based on Filtering Enhancement

DCFusion: Dual-Headed Fusion Strategy and Contextual Information Awareness for Infrared and Visible Remote Sensing Image