Infrared–Visible Image Fusion through Feature-Based Decomposition and Domain Normalization

Weiyi Chen, Lingjuan Miao, Yuhao Wang, Zhiqiang Zhou, Yajun Qiao
DOI: https://doi.org/10.3390/rs16060969
IF: 5
2024-03-11
Remote Sensing
Abstract: Infrared–visible image fusion is valuable across various applications due to the complementary information that it provides. However, the current fusion methods face challenges in achieving high-quality fused images. This paper identifies a limitation in the existing fusion framework that affects the fusion quality: modal differences between infrared and visible images are often overlooked, resulting in the poor fusion of the two modalities. This limitation implies that features from different sources may not be consistently fused, which can impact the quality of the fusion results. Therefore, we propose a framework that utilizes feature-based decomposition and domain normalization. This decomposition method separates infrared and visible images into common and unique regions. To reduce modal differences while retaining unique information from the source images, we apply domain normalization to the common regions within the unified feature space. This space can transform infrared features into a pseudo-visible domain, ensuring that all features are fused within the same domain and minimizing the impact of modal differences during the fusion process. Noise in the source images adversely affects the fused images, compromising the overall fusion performance. Thus, we propose the non-local Gaussian filter. This filter can learn the shape and parameters of its filtering kernel based on the image features, effectively removing noise while preserving details. Additionally, we propose a novel dense attention in the feature extraction module, enabling the network to understand and leverage inter-layer information. Our experiments demonstrate a marked improvement in fusion quality with our proposed method.
environmental sciences; imaging science & photographic technology; remote sensing; geosciences, multidisciplinary
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper aims to address several key challenges in infrared and visible image fusion:

1. **Modal Differences**: Infrared and visible images differ in wavelength, radiation source, and acquisition sensor, which produces differences in texture, brightness, and contrast. When these modal differences are ignored, features from the two sources are fused inconsistently, which lowers the quality of the fused image.
2. **Noise Issues**: Source images captured under low-light conditions usually contain substantial noise, which degrades the fused result.
3. **Intermediate Layer Information Loss**: Many existing fusion methods ignore information in the intermediate layers, even though it plays a crucial role in the fusion process. Although some fusion networks introduce dense connections, these connections increase computational cost.

To address these challenges, the authors propose a new method (UNIFusion) that combines image decomposition based on cosine similarity, a unified feature space, a dense attention mechanism, and a non-local Gaussian filter. Specifically, the method improves fusion quality through the following components:

- **Image Decomposition**: Using cosine similarity to decompose infrared and visible images into common and unique regions.
- **Unified Feature Space**: Converting infrared features into a pseudo-visible domain through Dynamic Instance Normalization (DIN) to reduce modal differences.
- **Dense Attention**: Introducing a dense attention mechanism in the feature extraction module, enabling the encoder to focus on more relevant features while ignoring redundant or irrelevant ones.
- **Non-local Gaussian Filter**: Designing a non-local Gaussian filter to reduce the impact of noise on the fusion results while preserving image details.

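
The cosine-similarity decomposition mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat list-of-vectors layout for feature maps, and the `threshold` value are all assumptions made for illustration, and the summary does not state how the paper actually separates common from unique regions once the similarity is computed.

```python
import math

def cosine_similarity(u, v, eps=1e-8):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors
    return dot / (norm_u * norm_v + eps)

def decompose(ir_feats, vis_feats, threshold=0.5):
    """Label each spatial position as common (True) or unique (False).

    ir_feats, vis_feats: lists of per-position feature vectors.
    threshold: hypothetical cut-off, not taken from the paper.
    """
    if len(ir_feats) != len(vis_feats):
        raise ValueError("feature maps must cover the same positions")
    sims = [cosine_similarity(u, v) for u, v in zip(ir_feats, vis_feats)]
    common_mask = [s >= threshold for s in sims]
    return sims, common_mask

# Toy example: three positions with two-channel features
ir = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
vis = [[2.0, 0.0], [1.0, -1.0], [0.0, 1.0]]
sims, common_mask = decompose(ir, vis)
```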
Through these techniques, the UNIFusion method effectively reduces the impact of modal differences and noise while maintaining image details, thereby generating high-quality fused images.
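
To give a rough feel for what moving infrared features into a pseudo-visible domain can mean, the sketch below matches the per-channel mean and standard deviation of an infrared feature channel to those of a visible channel. This is a deliberately simplified stand-in and not DIN itself: the summary does not describe how DIN computes its normalization parameters, so plain statistic matching is swapped in here. The function names and the toy values are hypothetical.

```python
import math

def channel_stats(channel, eps=1e-5):
    """Mean and standard deviation of one flattened feature channel."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((x - mean) ** 2 for x in channel) / n
    # eps keeps the standard deviation non-zero for constant channels
    return mean, math.sqrt(var + eps)

def to_pseudo_visible(ir_channel, vis_channel):
    """Re-scale an infrared channel to the visible channel's statistics.

    Simplified statistic matching, used as a stand-in for the
    normalization step; it is not the paper's DIN.
    """
    ir_mean, ir_std = channel_stats(ir_channel)
    vis_mean, vis_std = channel_stats(vis_channel)
    return [(x - ir_mean) / ir_std * vis_std + vis_mean for x in ir_channel]

# Toy channels restricted to a common region (values are made up)
ir_channel = [10.0, 12.0, 14.0, 16.0]
vis_channel = [0.1, 0.5, 0.3, 0.7]
pseudo_vis = to_pseudo_visible(ir_channel, vis_channel)
```

After the transform, the infrared channel keeps its internal ordering but shares the visible channel's mean and spread, which is the sense in which both inputs are then fused in the same domain.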