Abstract: Despite the impressive progress of recent low-light image enhancement techniques, the scarcity of paired data has emerged as a significant obstacle to further advancement. This work proposes a mean-teacher-based semi-supervised low-light enhancement (Semi-LLIE) framework that integrates unpaired data into model training. The mean-teacher technique is a prominent semi-supervised learning method that has been successfully adopted for both high-level and low-level vision tasks. However, two primary issues hinder the naive mean-teacher method from attaining optimal performance in low-light image enhancement. Firstly, a pixel-wise consistency loss is insufficient for transferring a realistic illumination distribution from the teacher to the student model, which results in color cast in the enhanced images. Secondly, cutting-edge image enhancement approaches fail to cooperate effectively with the mean-teacher framework to restore detailed information in dark areas, because they tend to overlook structured information within local regions. To mitigate these issues, we first introduce a semantic-aware contrastive loss to faithfully transfer the illumination distribution, which helps produce enhanced images with natural colors. Then, we design a Mamba-based low-light image enhancement backbone that strengthens Mamba's ability to represent pixel relationships in local regions through a multi-scale feature learning scheme, facilitating the generation of images with rich textural details. Further, we propose a novel perceptual loss based on the large-scale vision-language Recognize Anything Model (RAM) to help generate enhanced images with richer textural details. The experimental results indicate that our Semi-LLIE surpasses existing methods in both quantitative and qualitative metrics.
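The mean-teacher scheme mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual formulation, in which the teacher's weights are an exponential moving average (EMA) of the student's and a pixel-wise L1 term ties the student's prediction to the teacher's. The function names, the `decay` value, and the dictionary-of-lists weight layout are hypothetical and chosen for illustration only; this is not the paper's code.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def ema_update(teacher: Weights, student: Weights, decay: float = 0.99) -> Weights:
    """Return new teacher weights: decay * teacher + (1 - decay) * student."""
    return {
        name: [decay * t + (1.0 - decay) * s
               for t, s in zip(teacher[name], student[name])]
        for name in teacher
    }


def l1_consistency(student_out: List[float], teacher_out: List[float]) -> float:
    """Mean absolute difference between student and teacher predictions."""
    total = sum(abs(s - t) for s, t in zip(student_out, teacher_out))
    return total / len(student_out)


# Toy example: a single two-element "layer" (illustrative values only).
teacher_w = {"w": [0.0, 1.0]}
student_w = {"w": [1.0, 1.0]}
new_teacher_w = ema_update(teacher_w, student_w, decay=0.9)

# Pixel-wise consistency between two toy predictions on an unpaired image.
consistency = l1_consistency([0.5, 1.0], [0.1, 1.0])
```

In this formulation only the student receives gradients; the teacher changes solely through the EMA step, which is why its predictions are stable enough to serve as pseudo-targets for unpaired images.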

What problem does this paper attempt to address?

The paper addresses the scarcity of paired data in low-light image enhancement, which makes it difficult for existing methods to improve further. Although low-light image enhancement techniques have made significant progress in recent years, collecting large numbers of paired images of the same scene under real low-light and normal-light conditions is very difficult. This limits supervised learning methods, which usually require a large amount of paired data for training. In addition, synthesized low-light images differ considerably from real low-light images, so models trained on synthetic data generalize poorly in practical applications.

To solve these problems, the paper proposes a semi-supervised low-light image enhancement method (Semi-LLIE) based on the mean-teacher framework. The method aims to improve the model's generalization in real-life scenarios by integrating unpaired data into model training. The specific improvements are:

1. **Semantic-aware Contrastive Loss**: To transfer the real illumination distribution more effectively and reduce color deviation, the paper introduces a semantic-aware contrastive loss. It uses the intermediate representations of a large-scale vision-language model (RAM) to evaluate the semantic similarity between the original low-light image and its enhanced version, which helps generate enhanced images with natural colors.
2. **Mamba-based Low-light Image Enhancement Backbone Network**: To better restore detail in dark areas, the paper designs a new multi-scale state-space block (MSSB), which strengthens the Mamba model's ability to represent pixel relationships in local areas. Combined with a multi-scale feature learning scheme, this backbone network can generate images with rich textural details.
3. **RAM-based Perceptual Loss**: To further improve the textural details of the enhanced images, the paper proposes a new RAM-based perceptual loss function. It uses the intermediate features extracted from the last three stages of the pre-trained RAM image encoder to evaluate the perceptual similarity between two input images, which helps generate more realistic texture details.

With these innovations, Semi-LLIE outperforms existing unsupervised methods in both quantitative and qualitative evaluations, and in some cases even surpasses several influential supervised methods. It is particularly strong at generating enhanced images with rich local details and natural colors, which in turn improves the performance of downstream object detection.
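The two RAM-based losses described above can be sketched on pre-extracted feature vectors. This is an illustration under stated assumptions, not the paper's implementation: the features are plain Python lists standing in for RAM encoder activations, the contrastive term uses a ratio of L1 distances, and the roles (teacher output as positive, low-light input as negative) are assumptions the summary does not spell out. The equal stage weights are likewise hypothetical.

```python
from typing import List, Sequence

Feature = Sequence[float]


def l1_distance(a: Feature, b: Feature) -> float:
    """Mean absolute difference between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def contrastive_term(anchor: Feature, positive: Feature,
                     negative: Feature, eps: float = 1e-7) -> float:
    """Ratio-style contrastive term (assumed form).

    Small when the anchor is close to the positive and far from the negative.
    """
    return l1_distance(anchor, positive) / (l1_distance(anchor, negative) + eps)


def multi_stage_perceptual_loss(feats_a: List[Feature], feats_b: List[Feature],
                                weights: Sequence[float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted sum of per-stage L1 distances over three encoder stages."""
    return sum(w * l1_distance(fa, fb)
               for w, fa, fb in zip(weights, feats_a, feats_b))


# Toy features (illustrative values only).
student_feat = [1.0, 1.0]    # anchor: student's enhanced output
teacher_feat = [0.5, 0.5]    # positive (assumed): teacher's enhanced output
low_light_feat = [0.0, 0.0]  # negative (assumed): low-light input
contrast = contrastive_term(student_feat, teacher_feat, low_light_feat)

# Three stages of features for two images being compared.
stages_a = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
stages_b = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
perceptual = multi_stage_perceptual_loss(stages_a, stages_b)
```

In a real pipeline the feature lists would come from a frozen pre-trained encoder, so both losses push the enhanced image toward the target in feature space rather than matching pixels directly.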

Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement

LMT-GP: Combined Latent Mean-Teacher and Gaussian Process for Semi-supervised Low-light Image Enhancement

Mutual Support and Promotion: Learning Structure Compensation and Context Completion for Low-Light Vision

Pseudo-supervised Low-light Image Enhancement with Mutual Learning

MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space

CodeEnhance: A Codebook-Driven Approach for Low-Light Image Enhancement

Low-Light Image and Video Enhancement Using Deep Learning: A Survey

Troublemaker Learning for Low-Light Image Enhancement

Joint semantic-aware and noise suppression for low-light image enhancement without reference

Fusion-Based Low-Light Image Enhancement

Semantically-guided low-light image enhancement

Low-light Image Enhancement Via Deep Retinex Decomposition and Bilateral Learning

Dark2Light: multi-stage progressive learning model for low-light image enhancement

PIE: Physics-Inspired Low-Light Enhancement

More Than Lightening: A Self-Supervised Low-Light Image Enhancement Method Capable for Multiple Degradations

SLLEN: Semantic-aware Low-light Image Enhancement Network

Multi-Level Contrastive Student-Teacher Structure for Semi-Supervised Medical Image Segmentation

Latent Disentanglement for Low Light Image Enhancement

Multi-path parallel enhancement of low-light images based on multiscale spatially aware Retinex decomposition