Abstract: Existing image fusion approaches typically use a single deep network to solve different image fusion problems and have achieved promising performance in recent years. However, because no ground-truth output is available, these methods can exploit only the appearance of the source images during training to generate the fused image, which leads to suboptimal solutions. To this end, we advocate a self-evolutionary training formula by introducing a novel memory unit architecture (MUFusion). Specifically, this unit uses the intermediate fusion results obtained during training to further collaboratively supervise the fused image. In this way, our fusion results not only learn from the original input images but also benefit from the intermediate output of the network itself. Furthermore, an adaptive unified loss function is designed based on this memory unit. It is composed of two loss terms, i.e., content loss and memory loss. The content loss is calculated from the activity level maps of the source images, which constrains the output image to contain specific information. The memory loss is obtained from the previous output of our model and is used to force the network to yield fusion results of higher quality. Because handcrafted activity level maps cannot consistently reflect accurate salience judgements, we place two adaptive weight terms between the two losses to prevent this degradation. In general, our MUFusion can effectively handle a series of image fusion tasks, including infrared and visible image fusion, multi-focus image fusion, multi-exposure image fusion, and medical image fusion. In terms of architecture, the source images are concatenated in the channel dimension. A densely connected feature extraction network with two scales then extracts deep features from the source images. Following this, the fusion result is obtained by two feature reconstruction blocks with skip connections from the feature extraction network. Qualitative and quantitative experiments on four image fusion subtasks demonstrate the superiority of our MUFusion over state-of-the-art methods.
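The interplay of content loss, memory loss, and the two adaptive weights can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the gradient-based `activity_map`, the squared-error form of both losses, the weighting rule in `adaptive_weights`, and the dictionary used as a memory bank are hypothetical stand-ins, not the authors' definitions.

```python
import numpy as np

EPS = 1e-8  # avoids division by zero in flat image regions


def activity_map(img):
    """Hypothetical activity level: per-pixel absolute gradient magnitude."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.abs(gx) + np.abs(gy)


def content_loss(fused, src_a, src_b):
    """Pull each fused pixel toward the source with the higher activity."""
    act_a, act_b = activity_map(src_a), activity_map(src_b)
    w_a = (act_a + EPS) / (act_a + act_b + 2 * EPS)  # per-pixel weight in [0, 1]
    w_b = 1.0 - w_a
    return float(np.mean(w_a * (fused - src_a) ** 2 + w_b * (fused - src_b) ** 2))


def memory_loss(fused, memory):
    """Distance between the current output and the stored previous output."""
    return float(np.mean((fused - memory) ** 2))


def adaptive_weights(fused, memory, src_a, src_b):
    """Assumed rule: trust the memory more when it scores better on content."""
    c_cur = content_loss(fused, src_a, src_b)
    c_mem = content_loss(memory, src_a, src_b)
    w_mem = (c_cur + EPS) / (c_cur + c_mem + 2 * EPS)
    return 1.0 - w_mem, w_mem


def total_loss(fused, memory, src_a, src_b):
    """Weighted sum of the content term and the memory term."""
    w_content, w_mem = adaptive_weights(fused, memory, src_a, src_b)
    return (w_content * content_loss(fused, src_a, src_b)
            + w_mem * memory_loss(fused, memory))


# Toy usage: one sample, with a dict acting as the memory unit.
rng = np.random.default_rng(0)
src_a, src_b = rng.random((8, 8)), rng.random((8, 8))
memory_bank = {}                      # sample index -> previous fused output

fused = 0.5 * (src_a + src_b)         # stand-in for the network output
memory = memory_bank.get(0, fused)    # no memory yet on the first pass
loss = total_loss(fused, memory, src_a, src_b)
memory_bank[0] = fused.copy()         # store the output for the next pass
```

In a real training loop the stored output would be treated as a fixed target (no gradient flowing through it), and the stand-in `fused` would be replaced by the network's prediction for that sample.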

FusionDiff: A unified image fusion network based on diffusion probabilistic models

FusionDiff: Multi-focus image fusion using denoising diffusion probabilistic models

StackMFF: End-to-end Multi-Focus Image Stack Fusion Network

Diff-IF: Multi-modality image fusion via diffusion model with fusion knowledge prior

U2Fusion: A Unified Unsupervised Image Fusion Network

UNIFusion: A Lightweight Unified Image Fusion Network

DCAFuse: Dual-Branch Diffusion-CNN Complementary Feature Aggregation Network for Multi-Modality Image Fusion

MUFusion: A general unsupervised image fusion network based on memory unit

Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models

Conditional Controllable Image Fusion

DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation

Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond

[Leptin and feeding regulation].

Multi-Focus Image Fusion Using U-Shaped Networks with a Hybrid Objective

UFA-FUSE: A novel deep supervised and hybrid model for multi-focus image fusion

Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model

DCFusion: Difference correlation-driven fusion mechanism of infrared and visible images

MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion

DM-Fusion: Deep Model-Driven Network for Heterogeneous Image Fusion

DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion

A Task-guided, Implicitly-searched and Meta-initialized Deep Model for Image Fusion