MADMFuse: A Multi-Attribute Diffusion Model to Fuse Infrared and Visible Images

Hang Xu, Rencan Nie, Jinde Cao, Mingchuan Tan, Zhengze Ding
DOI: https://doi.org/10.1016/j.dsp.2024.104741
IF: 2.92
2024-01-01
Digital Signal Processing
Abstract: In deep-learning-based vision, infrared and visible image fusion (IVIF) has received significant attention because it enhances scene comprehension by combining the complementary features of the two image types. Existing methods based on generative models are either unstable in training, as with generative adversarial networks, or have only a single denoising object, as with diffusion models, which yields poor fused results that cannot fully contain multi-modal features. To solve these problems, we propose a multi-attribute diffusion model to fuse infrared and visible images, termed MADMFuse. Specifically, to preserve the salient information in infrared images and the texture details in visible images, we design a forward diffusion process with shared noise for multiple independent attributes, ensuring that the denoising network learns the complementary features of different attributes simultaneously during training. Subsequently, our reverse process, conditioned on the multiple attributes, employs the denoising network to iteratively reparameterize noise, gradually adjusting the result to achieve fine image fusion. Furthermore, to address the feature degradation that results from focusing solely on a single denoising object, we derive a multiple diffusion object loss function based on a pixel-level fusion target. The fidelity and luminance terms in this loss function guide the fused results toward visual similarity with the source images while preserving their brightness. Extensive experiments indicate that MADMFuse is more effective than other state-of-the-art image fusion methods.
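The abstract does not give the equations behind the "forward diffusion process with shared noise", so the following is only a minimal sketch of one plausible reading: a standard DDPM-style forward step in which the same Gaussian noise sample is added to every attribute image at a given timestep. The linear beta schedule, the function names `make_alpha_bar` and `shared_noise_forward`, and the attribute keys `"infrared"` and `"visible"` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def shared_noise_forward(attributes, t, alpha_bar, rng):
    """Noise every attribute image at timestep t with ONE shared noise sample.

    attributes: dict mapping a hypothetical attribute name to an array x0;
                all arrays must have the same shape.
    Returns the dict of noised arrays and the shared noise eps.
    """
    shape = next(iter(attributes.values())).shape
    eps = rng.standard_normal(shape)       # drawn once, reused for all attributes
    signal = np.sqrt(alpha_bar[t])         # weight on the clean image
    noise = np.sqrt(1.0 - alpha_bar[t])    # weight on the shared noise
    noised = {name: signal * x0 + noise * eps for name, x0 in attributes.items()}
    return noised, eps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    infrared = rng.random((8, 8))          # stand-in for a normalized infrared image
    visible = rng.random((8, 8))           # stand-in for a normalized visible image
    alpha_bar = make_alpha_bar()
    noised, eps = shared_noise_forward(
        {"infrared": infrared, "visible": visible}, t=500, alpha_bar=alpha_bar, rng=rng
    )
    # Because the noise is shared, it cancels in the difference of the noised images.
    diff = noised["infrared"] - noised["visible"]
    print(np.allclose(diff, np.sqrt(alpha_bar[500]) * (infrared - visible)))
```

Under this assumed formulation, a single denoising network trained to predict `eps` sees one noise target paired with several noised attributes, which is one way a network could be pushed to learn from all attributes at once. Whether MADMFuse uses exactly this schedule or parameterization is not stated in the abstract.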