Abstract: Infrared and visible image fusion aims to generate a synthetic image that contains both salient targets and abundant texture details. However, traditional techniques and recent deep learning-based approaches still struggle to preserve prominent structures and fine-grained features. In this study, we propose a lightweight infrared and visible image fusion network that uses multi-scale attention modules and hybrid dilated convolutional blocks to preserve significant structural features and fine-grained textural details. First, we design a hybrid dilated convolutional block with different dilation rates, which enlarges the receptive field of the fusion network and thereby enables the extraction of prominent structural features. Compared with other deep learning methods, our method obtains more high-level semantic information without stacking a large number of convolutional blocks, which effectively improves its feature representation ability. Second, we design distinct attention modules and integrate them into different layers of the network to fully exploit the contextual information of the source images, and we use the total loss to guide the fusion process to focus on vital regions and compensate for missing information. Extensive qualitative and quantitative experiments demonstrate the superiority of the proposed method over state-of-the-art methods in both visual quality and evaluation metrics. Experimental results on public datasets show that our method improves entropy (EN) by 4.80%, standard deviation (SD) by 3.97%, correlation coefficient (CC) by 1.86%, the sum of the correlations of differences (SCD) by 9.98%, and multi-scale structural similarity (MS_SSIM) by 5.64%. In addition, experiments on the VIFB dataset further indicate that our approach outperforms other comparable models.
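
The claim that dilated convolutions enlarge the receptive field without adding blocks can be checked with a little arithmetic. The sketch below is illustrative only: the abstract does not state the kernel sizes or dilation rates, so the 3x3 kernels and the rates (1, 2, 3) are assumptions, and `receptive_field` is a hypothetical helper rather than part of the proposed network.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (pixels along one axis) of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


# Three ordinary 3x3 convolutions (dilation 1 everywhere).
plain = receptive_field([3, 3, 3], [1, 1, 1])

# Three 3x3 convolutions with assumed hybrid dilation rates 1, 2, 3.
# The number of weights per layer is unchanged; only the sampling grid spreads out.
hybrid = receptive_field([3, 3, 3], [1, 2, 3])

print(plain, hybrid)  # 7 13
```

Under these assumed settings, the same three layers cover a 13-pixel span instead of 7, which is the sense in which a wider context is obtained without piling up more convolutional blocks.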

SFNet: A Scaling-down Fusion Network for Infrared and Visible Images with Texture Attention

Multi-Scale Cross-Attention Fusion Network Based on Image Super-Resolution

Multi-scale attention-based lightweight network with dilated convolutions for infrared and visible image fusion

A Cross-scale Iterative Attentional Adversarial Fusion Network for Infrared and Visible Images

An infrared and visible image fusion network based on multi‐scale feature cascades and non‐local attention

SADFusion: A multi-scale infrared and visible image fusion method based on salient-aware and domain-specific

A Multi-Stage Visible and Infrared Image Fusion Network Based on Attention Mechanism

Multi-scale unsupervised network for infrared and visible image fusion based on joint attention mechanism

MAFusion: Multiscale Attention Network for Infrared and Visible Image Fusion

CAFNET: Cross-Attention Fusion Network for Infrared and Low Illumination Visible-Light Image

SFPFusion: An Improved Vision Transformer Combining Super Feature Attention and Wavelet-Guided Pooling for Infrared and Visible Images Fusion

Effect of Laser Resurfacing on p53 Expression in Photoaged Facial Skin

Visible and Infrared Image Fusion Based on Attention and Multiscale Residuals

Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism

Integrating Parallel Attention Mechanisms and Multi-Scale Features for Infrared and Visible Image Fusion

IR-MSDNet: Infrared and Visible Image Fusion Based On Infrared Features and Multiscale Dense Network

SFDFusion: An Efficient Spatial-Frequency Domain Fusion Network for Infrared and Visible Image Fusion

Advancing infrared and visible image fusion with an enhanced multiscale encoder and attention-based networks

An end-to-end multi-scale network based on autoencoder for infrared and visible image fusion

[Reply to the discussion by Heinrich Kunze].

FSADFuse: A Novel Fusion Approach to Infrared and Visible Images