SwinMFF: toward high-fidelity end-to-end multi-focus image fusion via swin transformer-based network

Xinzhe Xie, Buyu Guo, Peiliang Li, Shuangyan He, Sangjun Zhou
DOI: https://doi.org/10.1007/s00371-024-03637-3
IF: 2.835
2024-10-06
The Visual Computer
Abstract: End-to-end approaches that directly learn the mapping from multi-focus images to fused images have been widely used recently and achieve excellent performance on complex scenes. However, their fusion quality falls short of decision map-based methods: decision map-based methods preserve the original pixels of the focused regions in the fused image, whereas end-to-end methods use network inference results that carry pixel-wise regression errors, resulting in low-fidelity fused images. To mitigate this limitation, we propose SwinMFF, which effectively captures long-range dependencies across the source images via the Swin Transformer to reduce pixel-wise regression errors, achieving high-fidelity end-to-end fusion while simultaneously alleviating edge artifacts in the fused image. Extensive experiments demonstrate that SwinMFF outperforms 28 other state-of-the-art methods in both subjective visual quality and quantitative metrics. The code is available at https://github.com/Xinzhe99/SwinMFF.
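The fidelity gap described in the abstract can be illustrated with a minimal NumPy sketch. This is not the SwinMFF implementation: the function name `fuse_with_decision_map`, the toy 4x4 images, the hand-made decision map, and the Gaussian noise standing in for a network's regression error are all assumptions made for illustration.

```python
import numpy as np

def fuse_with_decision_map(img_a, img_b, decision_map):
    """Copy pixels from img_a where decision_map is True, otherwise from img_b."""
    return np.where(decision_map, img_a, img_b)

rng = np.random.default_rng(0)
img_a = rng.random((4, 4))  # hypothetical source image focused on the left half
img_b = rng.random((4, 4))  # hypothetical source image focused on the right half

# Hypothetical binary decision map: True where img_a is in focus.
decision_map = np.zeros((4, 4), dtype=bool)
decision_map[:, :2] = True

# Decision map-based fusion copies source pixels, so focused regions are exact.
fused_dm = fuse_with_decision_map(img_a, img_b, decision_map)

# Stand-in for an end-to-end network: the same target plus small regression noise.
# A real network's error is not Gaussian; this only mimics per-pixel deviation.
fused_e2e = fused_dm + rng.normal(0.0, 0.01, size=fused_dm.shape)

# Maximum per-pixel deviation from the in-focus source pixels of img_a.
err_dm = np.abs(fused_dm[decision_map] - img_a[decision_map]).max()
err_e2e = np.abs(fused_e2e[decision_map] - img_a[decision_map]).max()
print(f"decision-map error: {err_dm}, end-to-end error: {err_e2e}")
```

In this toy setting the decision map-based result reproduces the focused source pixels exactly, while the regressed output deviates slightly at every pixel, which is the kind of error the abstract says SwinMFF aims to reduce.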
computer science, software engineering