Adjustable Visible and Infrared Image Fusion

Boxiong Wu,Jiangtao Nie,Wei,Lei Zhang,Yanning Zhang
DOI: https://doi.org/10.1109/tcsvt.2024.3449638
IF: 5.859
2024-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract:The visible and infrared image fusion (VIF) method aims to utilize the complementary information between these two modalities to synthesize a new image containing richer information. Although it has been extensively studied, the synthesized image that has the best visual results is difficult to reach consensus since users have different opinions. To address this problem, we propose an adjustable VIF framework termed AdjFusion, which introduces a global controlling coefficient into VIF to enforce it can interact with users. Within AdjFusion, a semantic-aware modulation module is proposed to transform the global controlling coefficient into a semantic-aware controlling coefficient, which provides pixel-wise guidance for AdjFusion considering both interactivity and semantic information within visible and infrared images. In addition, the introduced global controlling coefficient not only can be utilized as an external interface for interaction with users but also can be easily customized by the downstream tasks ( e.g ., VIF-based detection and segmentation), which can help to select the best fusion result for the downstream tasks. Taking advantage of this, we further propose a lightweight adaptation module for AdjFusion to learn the global controlling coefficient to be suitable for the downstream tasks better. Experimental results demonstrate the proposed AdjFusion can 1) provide ways to dynamically synthesize images to meet the diverse demands of users; and 2) outperform the previous state-of-the-art methods on both VIF-based detection and segmentation tasks, with the constructed lightweight adaptation method. Our code will be released after accepted at https://github.com/BearTo2/AdjFusion.
What problem does this paper attempt to address?