Learning a Holistic-Specific color transformer with Couple Contrastive constraints for underwater image enhancement and beyond

Debin Wei, Hongji Xie, Zengxi Zhang, Tiantian Yan
DOI: https://doi.org/10.1016/j.jvcir.2024.104059
IF: 2.887
2024-02-01
Journal of Visual Communication and Image Representation
Abstract: Underwater images suffer from several types of degradation caused by the characteristics of the medium, and this degradation interferes with underwater tasks. While deep learning methods based on the Convolutional Neural Network (CNN) excel at detection tasks, they have inherent limitations in handling long-range dependencies. The enhanced images generated by these methods often exhibit color cast, artificial traces, and insufficient contrast. To address these limitations, we present a novel Holistic-Specific attention (HSA) mechanism based on the Vision Transformer (ViT). This mechanism allows us to capture global information in finer detail and perform initial enhancement of underwater images. Notably, even when combined with ViT, CNNs do not always approach the ideal state of image enhancement, as the reference images themselves may involve human intervention. To tackle this, we design a loss function that incorporates contrastive learning, using the source image as a negative example. This approach guides the enhancement result toward the ideal enhancement state and away from the degraded state, rather than only toward the reference. Additionally, we introduce patch-based contrastive learning to address the shortcomings of image-based methods in fine-detail correction. Extensive qualitative and quantitative experiments demonstrate that the proposed method outperforms state-of-the-art techniques.
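The contrastive idea in the abstract (reference as positive, source image as negative, applied both to whole images and to patches) can be illustrated with a minimal sketch. This is not the paper's loss: the abstract gives no formula, so the ratio form below, the L1 pixel distance, the non-overlapping patch grid, and all function names are assumptions chosen for illustration only.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of floats in [0, 1].
Image = List[List[float]]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two images of the same size."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def contrastive_ratio(enhanced: Image, reference: Image, source: Image,
                      eps: float = 1e-7) -> float:
    """Hypothetical image-level term: small when the enhanced image is
    close to the reference (positive) and far from the degraded source
    (negative)."""
    pull = l1_distance(enhanced, reference)  # distance to the positive
    push = l1_distance(enhanced, source)     # distance to the negative
    return pull / (push + eps)


def crop(img: Image, top: int, left: int, size: int) -> Image:
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def patch_contrastive_ratio(enhanced: Image, reference: Image, source: Image,
                            patch: int = 2, eps: float = 1e-7) -> float:
    """Hypothetical patch-level term: the same ratio averaged over
    non-overlapping patches, so local errors are not averaged away."""
    height, width = len(enhanced), len(enhanced[0])
    ratios = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            ratios.append(contrastive_ratio(
                crop(enhanced, top, left, patch),
                crop(reference, top, left, patch),
                crop(source, top, left, patch),
                eps,
            ))
    return sum(ratios) / len(ratios)
```

In this sketch, an output that sits near the reference and far from the degraded input scores lower than one that barely departs from the input. The patch-level variant penalizes a single badly corrected region more strongly than the image-level term, which is the motivation the abstract gives for adding patch-based contrastive learning.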
computer science, information systems, software engineering