DiffLLE: Diffusion-guided Domain Calibration for Unsupervised Low-light Image Enhancement

Shuzhou Yang, Xuanyu Zhang, Yinhuai Wang, Jiwen Yu, Yuhan Wang, Jian Zhang
DOI: https://doi.org/10.48550/arXiv.2308.09279
2023-08-18
Abstract: Existing unsupervised low-light image enhancement methods lack sufficient effectiveness and generalization in practical applications. We attribute this to the absence of explicit supervision and the inherent gap between real-world scenarios and the training data domain. In this paper, we develop Diffusion-based domain calibration to realize more robust and effective unsupervised Low-Light Enhancement, called DiffLLE. Since the diffusion model exhibits impressive denoising capability and has been trained on massive clean images, we adopt it to bridge the gap between the real low-light domain and the training degradation domain, while providing efficient priors of real-world content for unsupervised models. Specifically, we adopt a naive unsupervised enhancement algorithm to realize preliminary restoration and design two zero-shot plug-and-play modules based on the diffusion model to improve generalization and effectiveness. The Diffusion-guided Degradation Calibration (DDC) module narrows the gap between real-world and training low-light degradation through diffusion-based domain calibration and a lightness enhancement curve, which makes the enhancement model perform robustly even under sophisticated in-the-wild degradation. Because the enhancement effect of the unsupervised model is limited, we further develop the Fine-grained Target domain Distillation (FTD) module to find a more visually friendly solution space. It exploits the priors of the pre-trained diffusion model to generate pseudo-references, which shrinks the preliminary restored results from a coarse normal-light domain to a finer high-quality clean field, addressing the lack of strong explicit supervision for unsupervised methods. Benefiting from these, our approach even outperforms some supervised methods while using only a simple unsupervised baseline. Extensive experiments demonstrate the superior effectiveness of the proposed DiffLLE.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the insufficient effectiveness and generalization of existing unsupervised low-light image enhancement methods in practical applications. The authors attribute this to the lack of explicit supervision and the inherent gap between real-world scenarios and the training data domain. For example, existing low-light datasets are built under idealized conditions, whereas real-world night-time scenes are full of complex interfering factors such as noise, artifacts, and extreme lighting conditions. To address these problems, the paper proposes a domain-calibration framework based on the diffusion model, named DiffLLE (Diffusion-guided Domain Calibration for Unsupervised Low-Light Image Enhancement). It aims to bridge the gap between the real-world low-light domain and the training low-light domain through the generative ability and progressive sampling mechanism of the diffusion model, while narrowing the coarse enhancement solution space to a fine normal-light domain. Specifically, the paper designs two modules:

1. **Diffusion-guided Degradation Calibration (DDC)**: Through diffusion-based domain calibration and a lightness enhancement curve, the gap between real-world and training low-light degradation is narrowed, enabling the enhancement model to perform robustly under complex real-world degradations.
2. **Fine-grained Target Domain Distillation (FTD)**: Using a pre-trained diffusion model to generate pseudo-references, the preliminarily restored results are refined from the coarse normal-light domain to a higher-quality clean domain, thereby addressing the lack of strong explicit supervision in unsupervised methods.

With these modules, DiffLLE not only performs well on standard benchmark datasets, but also achieves better results than other unsupervised methods, and even some supervised methods, in real-world low-light scenes.
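
The summary above mentions a lightness enhancement curve in the DDC module but does not give its form. As a rough illustration only, the sketch below uses the quadratic curve popularized by Zero-DCE, `LE(x) = x + alpha * x * (1 - x)`, applied iteratively to pixel values in [0, 1]. This is an assumption for illustration, not necessarily the curve used in DiffLLE; the function names, `alpha`, and `iterations` are hypothetical.

```python
# Illustrative sketch only: a Zero-DCE-style quadratic brightening curve.
# The curve used in DiffLLE's DDC module is not specified in this summary;
# enhance_curve, apply_curve, alpha and iterations are hypothetical names.

def enhance_curve(pixel, alpha):
    """Brighten one pixel in [0, 1]; the endpoints 0 and 1 stay fixed."""
    return pixel + alpha * pixel * (1.0 - pixel)


def apply_curve(image, alpha=0.8, iterations=4):
    """Apply the curve repeatedly to a 2-D grayscale image (list of rows)."""
    out = [row[:] for row in image]  # copy so the input is left unchanged
    for _ in range(iterations):
        out = [[enhance_curve(p, alpha) for p in row] for row in out]
    return out


dark = [[0.0, 0.05, 0.10], [0.20, 0.50, 1.0]]
bright = apply_curve(dark)
for row in bright:
    print([round(p, 3) for p in row])
```

For `alpha` in [0, 1] the curve is monotonic and maps [0, 1] into itself, so dark pixels are lifted while pure black and pure white are left untouched. In a full pipeline such a curve would only adjust brightness; handling noise and other real-world degradations is what the diffusion-based calibration is described as doing.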