Self-Calibrated Variance-Stabilizing Transformations for Real-World Image Denoising

Sébastien Herbreteau, Michael Unser
2024-07-25
Abstract: Supervised deep learning has become the method of choice for image denoising. It involves the training of neural networks on large datasets composed of pairs of noisy and clean images. However, the necessity of training data that are specific to the targeted application constrains the widespread use of denoising networks. Recently, several approaches have been developed to overcome this difficulty by either artificially generating realistic clean/noisy image pairs or training exclusively on noisy images. In this paper, we show that, contrary to popular belief, denoising networks specialized in the removal of Gaussian noise can be efficiently leveraged for real-world image denoising, even without additional training. For this to happen, an appropriate variance-stabilizing transform (VST) has to be applied beforehand. We propose an algorithm termed Noise2VST for the learning of such a model-free VST. Our approach requires only the input noisy image and an off-the-shelf Gaussian denoiser. We demonstrate through extensive experiments the efficiency and superiority of Noise2VST in comparison to existing methods trained in the absence of specific clean/noisy pairs.
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The problem this paper addresses is how to use deep neural networks trained specifically for Gaussian noise removal to denoise real-world images, without application-specific clean/noisy image pairs. The authors propose a method named Noise2VST: by learning an appropriate variance-stabilizing transformation (VST), an existing Gaussian denoising network can be applied directly to real-world noisy images.

### Main Problems and Challenges

1. **Limitations of Existing Methods**:
   - Supervised learning methods require a large number of clean/noisy image pairs, which are difficult to obtain in practical applications.
   - Models trained for Gaussian noise perform poorly when applied directly to real-world noise, because real-world noise usually does not follow a Gaussian distribution.
   - Methods that generate realistic clean/noisy image pairs, or that train only on noisy images, have various limitations.
2. **Objectives**:
   - Propose a method that does not require specific clean/noisy image pairs and can effectively use existing Gaussian denoising networks to process real-world noisy images.
   - Introduce an appropriate variance-stabilizing transformation (VST) that brings the noise distribution close to Gaussian, so that off-the-shelf Gaussian denoisers can be used.

### Solution Overview

The authors propose the Noise2VST framework, which consists of the following components:

1. **Design of the Variance-Stabilizing Transformation (VST)**:
   - A model-free VST in the form of a monotonically increasing piecewise-linear function is designed.
   - The VST and its inverse are modeled by spline functions so that the transformed noise is approximately Gaussian with a stabilized variance.
2. **Self-Supervised Learning Strategy**:
   - Use a blind-spot denoiser for training, avoiding the need for clean images.
   - Optimize the VST parameters by minimizing the self-supervised loss function so that the transformed images can be effectively processed by the Gaussian denoiser.
3. **Improvement in the Inference Stage**:
   - At inference, replace the blind-spot denoiser with a conventional Gaussian denoiser to improve denoising quality and, in particular, to avoid artifacts such as checkerboard patterns.

### Experimental Results

In extensive experiments, Noise2VST outperforms existing zero-shot denoising methods on both synthetic and real-world noise datasets. The advantage is especially marked on real-world noise datasets such as FMDD and W2S.

### Formula Summary

- Noise Model:
  \[ z_i \sim \mathcal{N}(s_i, \sigma^2) \]
  where \(z_i\) is the noisy pixel value, \(s_i\) is the original pixel value, and \(\sigma^2\) is the noise variance.
- Generalized Anscombe Transformation (GAT):
  \[ f_{\text{GAT}} : z \mapsto \frac{2}{a}\sqrt{\max\left(az + \frac{3}{8}a^2 + b,\, 0\right)} \]
  where \(a\) and \(b\) are the parameters of the noise model.
- Loss Function:
  \[ L_{\bar{D}}^{\theta,\alpha,\beta}(z,z) = \left\| \left(f_{\theta,\alpha,\beta}^{-1} \circ \bar{D} \circ f_\theta\right)(z) - z \right\|_2^2 \]
  where \(f_\theta\) is the learned VST, \(f_{\theta,\alpha,\beta}^{-1}\) is its parameterized inverse, and \(\bar{D}\) is the blind-spot denoiser used during training.

Together, these formulas and methods form the core of Noise2VST, enabling it to process noise in real-world images without relying on application-specific training data.
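
To make the idea of variance stabilization concrete, here is a minimal NumPy sketch of the GAT in its standard form, \(\frac{2}{a}\sqrt{\max(az + \frac{3}{8}a^2 + b, 0)}\). It is not the learned VST of Noise2VST, only the classical closed-form transform that it generalizes. The sketch assumes a mixed Poisson-Gaussian noise model \(z = a\,\mathrm{Poisson}(s/a) + \mathcal{N}(0, b)\); that model, the helper names `gat` and `simulate`, and the values of `a`, `b`, and the test intensities are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def gat(z, a, b):
    """Generalized Anscombe transform (standard closed form).

    Assumes z = a * Poisson(s / a) + N(0, b), i.e. gain a and
    Gaussian variance b (an assumed noise model for this sketch).
    """
    return (2.0 / a) * np.sqrt(np.maximum(a * z + 0.375 * a**2 + b, 0.0))


def simulate(s, a, b, n, rng):
    """Draw n noisy samples of a pixel with clean intensity s."""
    poisson_part = a * rng.poisson(s / a, size=n)
    gaussian_part = rng.normal(0.0, np.sqrt(b), size=n)
    return poisson_part + gaussian_part


rng = np.random.default_rng(0)
a, b = 0.5, 0.04                  # hypothetical noise parameters
intensities = [5.0, 20.0, 80.0]   # hypothetical clean pixel values

raw_std = []    # noise std before the transform (depends on intensity)
stab_std = []   # noise std after the transform (roughly constant)
for s in intensities:
    z = simulate(s, a, b, 200_000, rng)
    raw_std.append(float(z.std()))
    stab_std.append(float(gat(z, a, b).std()))

for s, r, t in zip(intensities, raw_std, stab_std):
    print(f"s={s:5.1f}  raw std={r:.3f}  stabilized std={t:.3f}")
```

Before the transform, the noise standard deviation grows with the signal intensity; after it, the standard deviation is close to 1 at every intensity. That is the condition under which a denoiser trained for Gaussian noise of a fixed level can be applied. Noise2VST replaces this fixed formula with a learned monotone piecewise-linear function, so no explicit noise model is needed.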