Abstract: Existing image fusion approaches use a single deep network to solve different image fusion problems and have achieved promising performance in recent years. However, because no ground-truth output exists, these methods can exploit only the appearance of the source images during training to generate the fused images, which leads to suboptimal solutions. To this end, we advocate a self-evolutionary training formula by introducing a novel memory unit architecture (MUFusion). Specifically, in this unit we use the intermediate fusion results obtained during training to collaboratively supervise the fused image. In this way, our fusion results not only learn from the original input images but also benefit from the intermediate output of the network itself. Furthermore, an adaptive unified loss function is designed based on this memory unit. It is composed of two loss terms, i.e., content loss and memory loss. The content loss is calculated from the activity level maps of the source images and constrains the output image to contain specific information. The memory loss is obtained from the previous output of our model and forces the network to yield fusion results of higher quality. Because the handcrafted activity level maps cannot consistently reflect accurate salience judgement, we place two adaptive weight terms between the two losses to prevent this degradation. In general, our MUFusion can effectively handle a series of image fusion tasks, including infrared and visible image fusion, multi-focus image fusion, multi-exposure image fusion, and medical image fusion. In the network, the source images are first concatenated along the channel dimension. A densely connected feature extraction network with two scales then extracts deep features from the source images. Finally, the fusion result is obtained by two feature reconstruction blocks with skip connections from the feature extraction network. Qualitative and quantitative experiments on four image fusion subtasks demonstrate the superiority of our MUFusion over state-of-the-art methods.
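
The interplay between the memory unit and the two loss terms can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the activity measure (absolute deviation from the image mean), the squared-error form of both losses, the fixed scalar weights `w_content` and `w_memory` standing in for the adaptive weight terms, and all function and class names. Flat Python lists stand in for image tensors.

```python
# Illustrative sketch only: names and formulas are assumptions,
# not the MUFusion implementation.

def activity_map(img):
    """Hypothetical activity level: absolute deviation from the image mean."""
    mean = sum(img) / len(img)
    return [abs(p - mean) for p in img]


def content_loss(fused, src_a, src_b, eps=1e-8):
    """Activity-weighted squared error between the fused image and both sources."""
    act_a, act_b = activity_map(src_a), activity_map(src_b)
    total = 0.0
    for f, a, b, act_pa, act_pb in zip(fused, src_a, src_b, act_a, act_b):
        # Normalise the two activity levels into per-pixel weights summing to 1.
        norm = act_pa + act_pb + 2 * eps
        w_a = (act_pa + eps) / norm
        w_b = (act_pb + eps) / norm
        total += w_a * (f - a) ** 2 + w_b * (f - b) ** 2
    return total / len(fused)


def memory_loss(fused, previous):
    """Mean squared error between the current output and the remembered one."""
    return sum((f - p) ** 2 for f, p in zip(fused, previous)) / len(fused)


class MemoryUnit:
    """Stores the most recent fused output for each training sample."""

    def __init__(self):
        self._store = {}

    def recall(self, sample_id):
        return self._store.get(sample_id)

    def update(self, sample_id, fused):
        self._store[sample_id] = list(fused)


def total_loss(fused, src_a, src_b, memory, sample_id,
               w_content=1.0, w_memory=1.0):
    """Weighted sum of content and memory loss (fixed weights as a stand-in)."""
    loss = w_content * content_loss(fused, src_a, src_b)
    previous = memory.recall(sample_id)
    if previous is not None:  # no memory term on the first visit to a sample
        loss += w_memory * memory_loss(fused, previous)
    memory.update(sample_id, fused)  # remember this output for the next visit
    return loss
```

In this sketch the memory term is skipped the first time a sample is seen, since no previous output exists yet. A real training loop would also need to decide when a stored output is good enough to act as supervision, which is the role the abstract assigns to the adaptive weights.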

MEFusion: Unsupervised Mutual Enhancement for Multimodal Image Fusion

CMEFusion: Cross-Modal Enhancement and Fusion of FIR and Visible Images

FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba

MEEAFusion: Multi-Scale Edge Enhancement and Joint Attention Mechanism Based Infrared and Visible Image Fusion

EMEF: Ensemble Multi-Exposure Image Fusion

Efficient Multi-exposure Image Fusion Via Filter-dominated Fusion and Gradient-driven Unsupervised Learning

Mutual-Guided Dynamic Network for Image Fusion

Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond

Unsupervised Image Fusion Method based on Feature Mutual Mapping

MUFusion: A general unsupervised image fusion network based on memory unit

TUFusion: A Transformer-based Universal Fusion Algorithm for Multimodal Images

A Dual Domain Multi-exposure Image Fusion Network based on the Spatial-Frequency Integration

E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection

LeGFusion: Locally Enhanced Global Learning for Multimodal Image Fusion

Multi-Exposure Image Fusion via Deformable Self-Attention

AIM-MEF: Multi-exposure image fusion based on adaptive information mining in both spatial and frequency domains

Mutual information maximization and feature space separation and bi-bimodal modality fusion for multimodal sentiment analysis

LeGFusion: Locally-enhanced Global Learning for Multi-Modal Image Fusion

RTFusion: A Multimodal Fusion Network with Significant Information Enhancement

MIMF: Mutual Information-Driven Multimodal Fusion

Deep Equilibrium Multimodal Fusion