Abstract: Ultra-high-definition (UHD) imaging devices are increasingly being used for underwater image acquisition. However, due to light scattering and underwater impurities, UHD underwater images often suffer from color deviations and blurred edges. Many studies have attempted to enhance underwater images by integrating frequency-domain and spatial-domain information. Nonetheless, these approaches often let the dual-domain features interact only in the final fusion module, neglecting the complementary and guiding roles that frequency-domain and spatial-domain features can play for each other. Additionally, the features of the two domains are extracted independently of each other, so each set of extracted features has pronounced strengths and weaknesses of its own. Consequently, these methods place high demands on the feature fusion capability of the fusion module. Yet in order to handle UHD underwater images, the fusion modules in these methods often stack only a limited number of convolution and activation operations. This limitation results in insufficient fusion capability, leading to defects in the restoration of edges and colors. To address these issues, we develop a dual-domain interaction network for enhancing UHD underwater images. The network lets frequency-domain and spatial-domain features complement and guide each other's feature extraction, and it fuses the dual-domain features throughout the model to better recover image details and colors. Specifically, the network has a U-shaped structure in which each layer is composed of dual-domain interaction transformer blocks, each containing an interactive multi-head attention module and an interactive simple gate feed-forward network. The interactive multi-head attention first captures local interaction features of the frequency-domain and spatial-domain information using convolution, and then applies multi-head attention to extract global information from the mixed features. The interactive simple gate feed-forward network further enhances the model's dual-domain interaction capability and cross-dimensional feature extraction ability, resulting in clearer edges and more realistic colors. Experimental results demonstrate that our method performs significantly better than existing methods in enhancing underwater images.
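
The order of operations described in the abstract (frequency and spatial features, local mixing by convolution, global attention over the mixed features, then a gated feed-forward step) can be illustrated with a toy sketch. This is not the authors' implementation: it works on a 1-D signal in pure Python, uses a single attention head with identity projections instead of learned multi-head attention, and assumes the "simple gate" means splitting the channels in half and multiplying the halves. All function names, kernels, and the `gain` parameter are hypothetical choices made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, keeping only the real part of the result."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def frequency_branch(x, gain):
    """Scale each frequency bin by a (hypothetical) gain, then return to the signal domain."""
    return idft([g * c for g, c in zip(gain, dft(x))])


def conv1d_same(x, kernel):
    """Zero-padded 1-D cross-correlation whose output has the same length as x."""
    r, n = len(kernel) // 2, len(x)
    return [sum(kernel[j] * x[i + j - r]
                for j in range(len(kernel)) if 0 <= i + j - r < n)
            for i in range(n)]


def local_interaction(spatial, freq, k_spatial, k_freq):
    """Mix both domains locally: one convolution per domain, summed per position."""
    a = conv1d_same(spatial, k_spatial)
    b = conv1d_same(freq, k_freq)
    return [u + v for u, v in zip(a, b)]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def self_attention(tokens):
    """Single-head scaled dot-product attention with identity Q/K/V (an assumption)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        out.append([sum(wi * v[c] for wi, v in zip(w, tokens)) for c in range(d)])
    return out


def simple_gate(token):
    """Assumed simple gate: split channels in half and multiply element-wise."""
    h = len(token) // 2
    return [a * b for a, b in zip(token[:h], token[h:])]


def dual_domain_block(signal, gain):
    """Toy block: dual-domain features -> local mixing -> global attention -> gate."""
    spatial = list(signal)
    freq = frequency_branch(signal, gain)
    smooth, edge = [0.25, 0.5, 0.25], [-1.0, 2.0, -1.0]  # hypothetical fixed kernels
    ch_a = local_interaction(spatial, freq, smooth, edge)
    ch_b = local_interaction(spatial, freq, edge, smooth)
    tokens = [[a, b] for a, b in zip(ch_a, ch_b)]  # two mixed channels per position
    return [simple_gate(t)[0] for t in self_attention(tokens)]


if __name__ == "__main__":
    print(dual_domain_block([0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0], [1.0] * 8))
```

In a real network the kernels, the frequency gain, and the attention projections would be learned, and the operations would run on 2-D feature maps with many channels. The sketch only shows that both domains are mixed before attention rather than in a final fusion step.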

Attention-guided hybrid transformer-convolutional neural network for underwater image super-resolution

Multi-Scale Cross-Attention Fusion Network Based on Image Super-Resolution

Learning hybrid dynamic transformers for underwater image super-resolution

Underwater Image Enhancement via Adaptive Group Attention-Based Multiscale Cascade Transformer

An Efficient Hybrid CNN-Transformer Approach for Remote Sensing Super-Resolution

Unformer: A Transformer-Based Approach for Adaptive Multi-Scale Feature Aggregation in Underwater Image Enhancement

Hybrid-Scale Hierarchical Transformer for Remote Sensing Image Super-Resolution

An effective transformer based on dual attention fusion for underwater image enhancement

Enhancement of Underwater Images through Parallel Fusion of Transformer and CNN

RT-CBAM: Refined Transformer Combined with Convolutional Block Attention Module for Underwater Image Restoration

Remote Sensing Image Super-Resolution Using Enriched Spatial-Channel Feature Aggregation Networks

An efficient parallel fusion structure of distilled and transformer-enhanced modules for lightweight image super-resolution

Multi-scale dense spatially-adaptive residual distillation network for lightweight underwater image super-resolution

WaterFormer: A Global–Local Transformer for Underwater Image Enhancement With Environment Adaptor

Underwater-image super-resolution via range-dependency learning of multiscale features

Super-Resolution Algorithm Based on Transformer+CNN

Ultra-high-definition Underwater Image Enhancement Via Dual-Domain Interactive Transformer Network

Efficient Adaptive Feature Fusion Network for Remote-Sensing Image Super-Resolution

DTCNet: Transformer-CNN Distillation for Super-Resolution of Remote Sensing Image

A fusion framework with multi-scale convolution and triple-branch cascaded transformer for underwater image enhancement

HCT: a hybrid CNN and transformer network for hyperspectral image super-resolution