UWFormer: Underwater Image Enhancement via a Semi-Supervised Multi-Scale Transformer

Weiwen Chen, Yingtie Lei, Shenghong Luo, Ziyang Zhou, Mingxian Li, Chi-Man Pun
2024-04-24
Abstract: Underwater images often exhibit poor quality, distorted color balance, and low contrast due to the complex interplay of light, water, and objects. Despite the significant contributions of previous underwater enhancement techniques, several problems demand further improvement: (i) Current deep learning methods rely on Convolutional Neural Networks (CNNs), which lack multi-scale enhancement and whose global perception field is limited. (ii) The scarcity of paired real-world underwater datasets poses a significant challenge, and the use of synthetic image pairs can lead to overfitting. To address these problems, this paper introduces a multi-scale Transformer-based network called UWFormer for enhancing images at multiple frequencies via semi-supervised learning, in which we propose a Nonlinear Frequency-aware Attention mechanism and a Multi-Scale Fusion Feed-forward Network for low-frequency enhancement. In addition, we introduce a dedicated underwater semi-supervised training strategy, in which we propose a Subaqueous Perceptual Loss function to generate reliable pseudo labels. Experiments on full-reference and non-reference underwater benchmarks demonstrate that our method outperforms state-of-the-art methods in terms of both quantitative metrics and visual quality.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses poor underwater image quality, in particular color distortion and low contrast, which are caused primarily by the complex way light propagates through water. To tackle this, it proposes UWFormer, a multi-scale Transformer for underwater image enhancement. The paper targets two limitations of existing methods:

1. **Insufficient Multi-Scale Enhancement and Global Perception**: Current methods based on convolutional neural networks (CNNs) lack multi-scale enhancement and have limited global perception.
2. **Scarcity of Paired Data**: Paired real-world underwater image datasets are scarce, and training on synthetic image pairs may lead to overfitting.

To address these issues, the paper makes the following main contributions:

1. **UWFormer Model**: A semi-supervised multi-scale Transformer network for multi-frequency image enhancement. It includes two new modules: the Nonlinear Frequency-aware Attention (NFA) module and the Multi-Scale Fusion Feed-forward Network (MSFN). These modules provide frequency-aware attention and variable receptive fields, which the authors report improve image restoration performance.
2. **Semi-Supervised Training Strategy**: To make use of unpaired data, the paper proposes a dedicated semi-supervised training strategy. It includes a new loss function, the Subaqueous Perceptual Loss (SPL), which is used to generate reliable pseudo-labels.
3. **Experimental Results**: Experiments on six benchmark datasets show that UWFormer outperforms current state-of-the-art methods in both visual quality and quantitative metrics.

In summary, the paper combines a multi-scale Transformer architecture with a semi-supervised learning strategy to address the two limitations above.
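To make the semi-supervised idea concrete, here is a minimal sketch of a generic pseudo-labeling objective: a supervised term on paired data plus a weighted consistency term on unpaired data. It is not the paper's implementation. The names `contrast_score`, `update_pseudo_label`, `semi_supervised_loss`, and `unsup_weight` are hypothetical, images are flattened lists of floats for simplicity, and the standard-deviation score is only a placeholder for whatever quality signal a loss such as SPL would supply.

```python
from typing import Callable, List, Optional

Image = List[float]  # assumption: a flattened image with pixel values in [0, 1]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of equal size."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def contrast_score(img: Image) -> float:
    """Placeholder quality score: pixel standard deviation.

    This is a stand-in only; the paper's Subaqueous Perceptual Loss is not
    specified in the text above and is not reproduced here.
    """
    mean = sum(img) / len(img)
    return (sum((p - mean) ** 2 for p in img) / len(img)) ** 0.5


def update_pseudo_label(
    current_best: Optional[Image],
    candidate: Image,
    score: Callable[[Image], float] = contrast_score,
) -> Image:
    """Keep whichever pseudo label scores higher under the quality score."""
    if current_best is None or score(candidate) > score(current_best):
        return candidate
    return current_best


def semi_supervised_loss(
    pred_labeled: Image,
    target: Image,
    pred_unlabeled: Image,
    pseudo_label: Image,
    unsup_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Supervised L1 term plus a weighted L1 term against the pseudo label."""
    supervised = l1_distance(pred_labeled, target)
    unsupervised = l1_distance(pred_unlabeled, pseudo_label)
    return supervised + unsup_weight * unsupervised
```

In this sketch the pseudo label for an unpaired image is replaced only when a new candidate scores higher, which is one simple way to keep unreliable outputs from becoming training targets.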