TUFusion: A Transformer-based Universal Fusion Algorithm for Multimodal Images

Yangyang Zhao, Qingchun Zheng, Peihao Zhu, Xu Zhang, Wenpeng Ma
DOI: https://doi.org/10.1109/tcsvt.2023.3296745
IF: 5.859
2023-01-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Multimodal image fusion is an important research direction within multimodal fusion. By exploiting complementary multimodal images, the technique enhances images and data, and it is widely used in medicine, industry, security and fire protection, autonomous driving, and consumer electronics. In this work, we propose a transformer-based universal fusion (TUFusion) algorithm with multidomain fusion capability. Its advantage lies in a hybrid transformer and convolutional neural network (CNN) encoder structure and a new composite attention fusion strategy, which together integrate global and local information. Compared with classical state-of-the-art multimodal image fusion methods, experimental results on multidomain data sets show that the TUFusion algorithm has a degree of universality in image fusion. The proposed TUFusion algorithm also achieves good values on peak signal-to-noise ratio (PSNR), root mean square error (RMSE), and structural similarity index measure (SSIM). The code of the TUFusion algorithm is available at https://github.com/windrunners/TUFusion.
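For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch of the standard RMSE and PSNR definitions. It is not taken from the TUFusion repository. The function names `rmse` and `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration, and the abstract does not say how the paper pairs fused images with source images when computing these scores.

```python
import math


def rmse(reference, fused):
    """Root mean square error between two equally sized grayscale images.

    Images are given as nested lists of pixel intensities (an assumption
    made to keep this sketch dependency-free).
    """
    if not reference or len(reference) != len(fused):
        raise ValueError("images must be non-empty and have the same shape")
    if any(len(r) != len(f) for r, f in zip(reference, fused)):
        raise ValueError("images must be non-empty and have the same shape")
    squared = [
        (r - f) ** 2
        for row_r, row_f in zip(reference, fused)
        for r, f in zip(row_r, row_f)
    ]
    if not squared:
        raise ValueError("images must be non-empty and have the same shape")
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, fused, max_value=255.0):
    """Peak signal-to-noise ratio in decibels: 20 * log10(max_value / RMSE)."""
    err = rmse(reference, fused)
    if err == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 20.0 * math.log10(max_value / err)


if __name__ == "__main__":
    source = [[0, 0], [0, 0]]
    result = [[10, 10], [10, 10]]
    print(rmse(source, result))
    print(psnr(source, result))
```

Lower RMSE and higher PSNR both indicate that the fused image is numerically closer to the image it is compared against. SSIM is omitted here because it requires local windowed statistics and is better taken from an established library.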
engineering, electrical & electronic