HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning

Zengxi Zhang, Zhiying Jiang, Long Ma, Jinyuan Liu, Xin Fan, Risheng Liu
2024-11-27
Abstract: Underwater images are often affected by light refraction and absorption, reducing visibility and interfering with subsequent applications. Existing underwater image enhancement methods primarily focus on improving visual quality while overlooking practical implications. To strike a balance between visual quality and application, we propose a heuristic invertible network for underwater perception enhancement, dubbed HUPE, which enhances visual quality and demonstrates flexibility in handling other downstream tasks. Specifically, we introduce an information-preserving reversible transformation with an embedded Fourier transform to establish a bidirectional mapping between underwater images and their clear counterparts. Additionally, a heuristic prior is incorporated into the enhancement process to better capture scene information. To further bridge the feature gap between vision-based enhancement images and application-oriented images, a semantic collaborative learning module is applied in the joint optimization process of the visual enhancement task and the downstream task, which guides the proposed enhancement model to extract more task-oriented semantic features while obtaining visually pleasing images. Extensive experiments, both quantitative and qualitative, demonstrate the superiority of our HUPE over state-of-the-art methods. The source code is available at https://github.com/ZengxiZhang/HUPE.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the trade-off in underwater image enhancement between visual quality and usefulness for downstream perception. Existing methods mainly focus on improving how underwater images look while ignoring the effect on subsequent perception tasks such as object detection and semantic segmentation, which limits their effectiveness in practical applications. To address this, the authors propose Heuristic Underwater Perceptual Enhancement (HUPE), which aims to improve the visual quality of underwater images and their support for downstream tasks at the same time. The main contributions of the paper are:

1. **Information-Preserving Reversible Network**: By constructing a two-way mapping between underwater images and clear images, HUPE achieves an information-preserving reversible transformation. The forward process performs the enhancement and models the manifold structure of in-air (clear) images; the backward process acts as a constraint that reduces artifacts and prevents information loss.
2. **Heuristic Prior Information**: Integrating heuristic prior information into the data-driven mapping improves the model's performance in complex underwater environments and makes the overall framework more interpretable.
3. **Semantic Collaborative Learning Module**: A semantic collaborative learning module bridges the gap between visual enhancement features and high-level semantic perception features during training. With a feature collaboration module embedded between the enhancement network and the downstream task network, the enhancement network not only improves visual quality but also extracts semantic features of the image, thereby achieving perceptual enhancement.
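To make the idea of an information-preserving reversible transformation in the first contribution concrete, here is a minimal sketch of an additive coupling layer, a standard building block for invertible networks. The summary does not say which reversible block HUPE uses, so the coupling form, the `shift` function, and the toy four-value "image" are all assumptions for illustration only.

```python
import math


def shift(values):
    # Hypothetical stand-in for a learned sub-network. It never needs to be
    # inverted itself: the coupling is undone by subtracting the same output.
    return [math.tanh(2.0 * v) + 0.5 for v in values]


def forward(x):
    """Additive coupling: pass the first half through, shift the second half."""
    assert len(x) % 2 == 0, "this sketch assumes an even number of values"
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    y2 = [b + s for b, s in zip(x2, shift(x1))]
    return x1 + y2


def inverse(y):
    """Exact inverse of `forward`: recompute the shift and subtract it."""
    assert len(y) % 2 == 0, "this sketch assumes an even number of values"
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    x2 = [b - s for b, s in zip(y2, shift(y1))]
    return y1 + x2


degraded = [0.2, 0.4, 0.1, 0.3]   # toy input playing the role of an underwater image
enhanced = forward(degraded)      # forward pass: the "enhancement" direction
recovered = inverse(enhanced)     # backward pass: maps back to the input
max_error = max(abs(a - b) for a, b in zip(degraded, recovered))
```

Because the backward pass reconstructs the input up to floating-point rounding, no information is discarded by the forward pass, which is the property the summary attributes to the bidirectional mapping.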
### Formula Summary

- **Heuristic Prior Injection**:
  \[ J_c(x)=\frac{1}{t(x)}I_c(x)+\frac{1}{t(x)}B_c\,\bigl(t(x)-1\bigr),\quad c\in\{r,g,b\} \]
  where \(J\) represents the enhanced image, \(I\) represents the underwater image captured by the sensor, \(B\) represents the ambient light, and \(t\) represents the medium transmission coefficient.
- **Fourier Transform**:
  \[ F(x)(i,j)=\sum_{h=0}^{H-1}\sum_{w=0}^{W-1}x(h,w)\,e^{-j2\pi\left(\frac{hi}{H}+\frac{wj}{W}\right)} \]
- **Frequency-Domain Representation**:
  \[ F(x)=R(x)+jI(x) \]
  where \(R(x)\) and \(I(x)\) are the real part and the imaginary part respectively.
- **Amplitude Spectrum and Phase Spectrum**:
  \[ A(x)(i,j)=\sqrt{R^2(x)(i,j)+I^2(x)(i,j)} \]
  \[ P(x)(i,j)=\arctan\left(\frac{I(x)(i,j)}{R(x)(i,j)}\right) \]
- **Loss Function**:
  - **Contrast Loss**:
    \[ L_c=\sum_{i=1}^{N}\rho_i\cdot\frac{\|VGG_i(I_r)-VGG_i(G_E(I_u))\|_1}{\|VGG_i(I_u)-VGG_i(G_E(I_u))\|_1} \]
  - **Frequency Loss**:
    \[ L_f=\|F(G_E(I_u))-F(I_r)\|_1 \]
  - **Bidirectional Loss**:
    \[ L_b=\|G_E(I_u)-I_r\|_2+\|G_E^{-1}(I_r)-I_u\|_2 \]
  - **Total Enhancement Loss**:
    \[ L_e=\lambda_1L_c+\lambda_2L_f+\lambda_3L_b \]
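The heuristic prior, Fourier transform, and frequency loss formulas can be checked numerically on a toy example. The sketch below is not the authors' implementation: it uses plain Python lists for a single 2x2 channel, writes the DFT as the literal double sum (with `u`, `v` standing in for the frequency indices), and reads the L1 norm in the frequency loss as a sum of complex magnitudes, which is an assumption. All function names and numeric values are hypothetical.

```python
import cmath
import math


def restore_channel(I, B, t):
    """Heuristic prior per pixel: J = I/t + B*(t - 1)/t."""
    return [[(I[h][w] + B * (t[h][w] - 1.0)) / t[h][w]
             for w in range(len(I[0]))] for h in range(len(I))]


def dft2(x):
    """2D DFT as the double sum over h and w (slow; toy sizes only)."""
    H, W = len(x), len(x[0])
    return [[sum(x[h][w] * cmath.exp(-2j * math.pi * (h * u / H + w * v / W))
                 for h in range(H) for w in range(W))
             for v in range(W)] for u in range(H)]


def amplitude_phase(F):
    """Amplitude sqrt(R^2 + I^2) and phase of each frequency coefficient."""
    A = [[abs(c) for c in row] for row in F]
    # atan2 is used instead of a bare arctan(I/R) so the quadrant is
    # resolved and R = 0 does not cause a division by zero.
    P = [[math.atan2(c.imag, c.real) for c in row] for row in F]
    return A, P


def frequency_loss(pred, ref):
    """Sum of |F(pred) - F(ref)| over all frequencies (assumed L1 reading)."""
    Fp, Fr = dft2(pred), dft2(ref)
    return sum(abs(a - b) for rp, rr in zip(Fp, Fr) for a, b in zip(rp, rr))


# Toy scene: a clear channel, a transmission map, and the ambient light.
J_true = [[0.8, 0.2], [0.5, 0.6]]
t = [[0.5, 0.8], [0.9, 0.4]]
B = 0.3

# Synthesize the observation with the forward scene model I = J*t + B*(1 - t),
# which is the relation the heuristic prior formula inverts.
observed = [[J_true[h][w] * t[h][w] + B * (1.0 - t[h][w])
             for w in range(2)] for h in range(2)]

restored = restore_channel(observed, B, t)
A, P = amplitude_phase(dft2(J_true))
```

With the true `B` and `t`, `restored` matches `J_true` up to rounding, and `A[0][0]` equals the sum of all pixel values (the zero-frequency term). In practice `B` and `t` are unknown and have to be estimated, so this exact recovery is only a sanity check of the formula.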