Rethinking the Evaluation of Visible and Infrared Image Fusion

Dayan Guan, Yixuan Wu, Tianzhu Liu, Alex C. Kot, Yanfeng Gu
2024-10-09
Abstract: Visible and Infrared Image Fusion (VIF) has garnered significant interest across a wide range of high-level vision tasks, such as object detection and semantic segmentation. However, the evaluation of VIF methods remains challenging due to the absence of ground truth. This paper proposes a Segmentation-oriented Evaluation Approach (SEA) to assess VIF methods by incorporating the semantic segmentation task and leveraging segmentation labels available in latest VIF datasets. Specifically, SEA utilizes universal segmentation models, capable of handling diverse images and classes, to predict segmentation outputs from fused images and compare these outputs with segmentation labels. Our evaluation of recent VIF methods using SEA reveals that their performance is comparable or even inferior to using visible images only, despite nearly half of the infrared images demonstrating better performance than visible images. Further analysis indicates that the two metrics most correlated to our SEA are the gradient-based fusion metric $Q_{\text{ABF}}$ and the visual information fidelity metric $Q_{\text{VIFF}}$ in conventional VIF evaluation metrics, which can serve as proxies when segmentation labels are unavailable. We hope that our evaluation will guide the development of novel and practical VIF methods. The code has been released at https://github.com/Yixuan-2002/SEA/.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the difficulty of evaluating visible and infrared image fusion (VIF) methods. Because fused images have no ground truth, existing VIF evaluation methods are limited. The paper proposes a Segmentation-oriented Evaluation Approach (SEA), which assesses fused-image quality through the semantic segmentation task: a general-purpose segmentation model predicts a segmentation output from the fused image, and that output is compared with the segmentation labels to score the fusion method.

### Main Contributions

1. **A new evaluation method**: SEA works around the absence of ground truth in VIF evaluation by using general-purpose segmentation models, and it applies to multiple classes across different datasets.
2. **Comprehensive comparative study**: 30 recent VIF methods were evaluated with SEA and with 15 conventional evaluation metrics on the latest datasets.
3. **Correlation analysis**: Statistical correlation measures were used to assess the agreement between SEA and the conventional metrics. $Q_{\text{ABF}}$ and $Q_{\text{VIFF}}$ were found to be the two most correlated metrics and can serve as proxies when segmentation labels are unavailable.

### Background

- **Visible and infrared image fusion**: Visible images provide rich color and texture information but are strongly affected by environmental factors; infrared images highlight targets but lack color and texture information. Fusing the two modalities is therefore expected to improve the performance of vision tasks.
- **Evaluation challenges**: Because there is no ground truth for a fused image, existing VIF evaluation methods struggle to measure fusion quality accurately.

### Method

- **General-purpose segmentation models**: Three recent general-purpose segmentation models were selected: X-Decoder, SEEM, and G-SAM. These models can handle diverse images and classes.
- **Evaluation process**:
  1. Generate the fused image with the VIF method.
  2. Use the general-purpose segmentation model to predict a segmentation output from the fused image.
  3. Compare the predicted segmentation output with the annotated segmentation labels and compute the mIoU score as the measure of fusion quality.

### Experimental Results

- **Performance comparison**: Recent VIF methods perform comparably to, or even worse than, using visible images alone, even though nearly half of the infrared images yield better results than their visible counterparts.
- **Correlation analysis**: $Q_{\text{ABF}}$ and $Q_{\text{VIFF}}$ have the highest correlation with SEA and can be used as evaluation metrics when segmentation labels are unavailable.

### Conclusion

The proposed SEA provides a new and more reliable way to evaluate VIF, which can help guide the future development of VIF methods and improve both fused-image quality and downstream vision-task performance.