Abstract: Infrared and visible image fusion aims to generate a fused image that contains the intensity and detail information of the source images. The key issue is effectively measuring and integrating the complementary information of multi-modality images of the same scene. Existing methods mostly adopt a simple weight in the loss function to decide how much information from each modality is retained, rather than adaptively measuring complementary information for different image pairs. In this study, we propose a multi-scale dual attention (MDA) framework for infrared and visible image fusion, which measures and integrates complementary information in both the network structure and the loss function at the image and patch levels. In our method, a residual downsample block first decomposes the source images into three scales. Then, a dual attention fusion block integrates complementary information and generates spatial and channel attention maps at each scale for feature fusion. Finally, the output image is reconstructed by a residual reconstruction block. The loss function consists of three parts: image-level, feature-level, and patch-level. The image-level and patch-level parts are computed using weights generated by the complementary information measurement. In addition, a style loss is added to constrain the pixel intensity distribution between the output and the infrared image. Our fusion results are robust and informative across different scenarios. Qualitative and quantitative results on two datasets show that our method preserves both thermal radiation and detail information from the two modalities and achieves results comparable to other state-of-the-art methods. Ablation experiments demonstrate the effectiveness of our information integration architecture and of adaptively measuring complementary information retention in the loss function.
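
To make the idea of fusing features with both channel and spatial attention maps more concrete, the following is a minimal NumPy sketch. It is not the MDA implementation: the abstract does not specify how the attention maps are computed, so the function names (`softmax_pair`, `dual_attention_fuse`), the use of mean absolute activations as descriptors, the softmax across the two modalities, and the simple averaging of the two branches are all assumptions made for illustration. A real dual attention fusion block would use learned layers.

```python
import numpy as np


def softmax_pair(a, b):
    """Element-wise softmax over two arrays; the two outputs sum to 1."""
    m = np.maximum(a, b)  # subtract the max for numerical stability
    ea, eb = np.exp(a - m), np.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def dual_attention_fuse(feat_ir, feat_vis):
    """Fuse two (C, H, W) feature maps with channel and spatial attention.

    Hypothetical, parameter-free stand-in for a learned fusion block.
    """
    # Channel attention (assumed): one score per channel from global
    # average pooling of absolute activations, shape (C, 1, 1).
    ch_ir = np.abs(feat_ir).mean(axis=(1, 2), keepdims=True)
    ch_vis = np.abs(feat_vis).mean(axis=(1, 2), keepdims=True)
    w_ch_ir, w_ch_vis = softmax_pair(ch_ir, ch_vis)

    # Spatial attention (assumed): one score per pixel from the mean
    # absolute activation over channels, shape (1, H, W).
    sp_ir = np.abs(feat_ir).mean(axis=0, keepdims=True)
    sp_vis = np.abs(feat_vis).mean(axis=0, keepdims=True)
    w_sp_ir, w_sp_vis = softmax_pair(sp_ir, sp_vis)

    # Each branch is a convex combination of the two inputs.
    fused_ch = w_ch_ir * feat_ir + w_ch_vis * feat_vis
    fused_sp = w_sp_ir * feat_ir + w_sp_vis * feat_vis

    # Assumed combination rule: average the two attention branches.
    return 0.5 * (fused_ch + fused_sp)


# Usage: fuse random stand-in features at three scales.
rng = np.random.default_rng(0)
for size in (32, 16, 8):
    ir = rng.normal(size=(4, size, size))
    vis = rng.normal(size=(4, size, size))
    fused = dual_attention_fuse(ir, vis)
    assert fused.shape == ir.shape
```

Because the weights in each branch sum to one, every fused value lies between the corresponding infrared and visible values, and fusing a feature map with itself returns it unchanged.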

MCFusion: infrared and visible image fusion based multiscale receptive field and cross-modal enhanced attention mechanism

Multi-Scale Cross-Attention Fusion Network Based on Image Super-Resolution

MEEAFusion: Multi-Scale Edge Enhancement and Joint Attention Mechanism Based Infrared and Visible Image Fusion

MAFusion: Multiscale Attention Network for Infrared and Visible Image Fusion

CMEFusion: Cross-Modal Enhancement and Fusion of FIR and Visible Images

Integrating Parallel Attention Mechanisms and Multi-Scale Features for Infrared and Visible Image Fusion

DCFusion: A Dual-Frequency Cross-Enhanced Fusion Network for Infrared and Visible Image Fusion.

ICAFusion: Iterative cross-attention guided feature fusion for multispectral object detection

A Cross-scale Iterative Attentional Adversarial Fusion Network for Infrared and Visible Images

SCFusion: Infrared and Visible Fusion Based on Salient Compensation

SFCFusion: Spatial–Frequency Collaborative Infrared and Visible Image Fusion

CMRFusion: A cross-domain multi-resolution fusion method for infrared and visible image fusion

CHFusion: A Cross-modality High-resolution Representation Framework for Infrared and Visible Image Fusion

Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism

Multi-scale attention-based lightweight network with dilated convolutions for infrared and visible image fusion

A Multi-scale Information Integration Framework for Infrared and Visible Image Fusion

Advancing infrared and visible image fusion with an enhanced multiscale encoder and attention-based networks

SADFusion: A multi-scale infrared and visible image fusion method based on salient-aware and domain-specific

Visible and Infrared Image Fusion Based on Attention and Multiscale Residuals

Rethinking Cross-Attention for Infrared and Visible Image Fusion

BCMFIFuse: A Bilateral Cross-Modal Feature Interaction-Based Network for Infrared and Visible Image Fusion