A fusion framework with multi-scale convolution and triple-branch cascaded transformer for underwater image enhancement

Dan Xiang, Zebin Zhou, Wenlei Yang, Huihua Wang, Pan Gao, Mingming Xiao, Jinwen Zhang, Xing Zhu
DOI: https://doi.org/10.1016/j.optlaseng.2024.108640
IF: 5.666
2024-10-20
Optics and Lasers in Engineering
Abstract: Acquiring high-quality underwater images is critical for various marine applications. However, light absorption and scattering in underwater environments severely degrade image quality. To address these issues, this study proposes a Fusion Framework with Multi-Scale Convolution and Triple-Branch Cascaded Transformer for Underwater Image Enhancement (FMTformer). The framework combines multi-scale convolution with a triple-branch cascaded transformer to enhance underwater images. FMTformer incorporates the Multi-Conv Multi-Scale Fusion (MCMF) mechanism, which uses convolutional kernels of several sizes to extract multi-scale features from both the base and detail layers of the decomposed image, so that both high- and low-frequency information is captured. The study also introduces the Tri-Branch Self-Attention Transformer (TBSAT), which captures cross-dimensional interactions through its tri-branch structure and thereby improves image processing quality. The framework further embeds the Value Reconstruct Cascade Transformer (VRCT), which refines the feature map representation through mixed convolution, yielding enriched attention maps. Experimental results indicate that FMTformer matches or outperforms existing state-of-the-art methods on both subjective and objective evaluation metrics.
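The base/detail decomposition and multi-scale feature extraction mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' MCMF module: the abstract does not specify the decomposition or the kernels, so the sketch assumes a mean-filter low-pass for the base layer, fixed box kernels of sizes 3, 5, and 7 in place of learned convolutions, and simple averaging as the fusion step. All function names (`box_filter`, `decompose`, `multi_scale_features`, `fuse`) are hypothetical.

```python
import numpy as np

def box_filter(img, k):
    """Mean filter with a k x k window (k odd), using edge padding."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the k*k shifted views of the padded image, then normalize.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def decompose(img, k=5):
    """Assumed decomposition: base = low-pass image, detail = residual."""
    base = box_filter(img, k)
    detail = img - base
    return base, detail

def multi_scale_features(layer, kernel_sizes=(3, 5, 7)):
    """Stand-in for learned multi-scale convolutions: one map per kernel size."""
    return np.stack([box_filter(layer, k) for k in kernel_sizes], axis=0)

def fuse(features):
    """Assumed fusion: average the per-scale maps."""
    return features.mean(axis=0)

# Usage on a random single-channel "image".
rng = np.random.default_rng(0)
img = rng.random((8, 8))
base, detail = decompose(img)
base_feats = multi_scale_features(base)      # shape (3, 8, 8)
detail_feats = multi_scale_features(detail)  # shape (3, 8, 8)
fused = fuse(base_feats) + fuse(detail_feats)
```

Larger kernels respond to slowly varying (low-frequency) structure while smaller ones keep more local (high-frequency) variation, which is the intuition behind processing both layers at several scales.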
optics