Zhenning Shi, Haoshuai Zheng, Chen Xu, Changsheng Dong, Bin Pan, Xueshuo Xie, Along He, Tao Li, Huazhu Fu
Abstract: Recently, research on denoising diffusion models has expanded its application to the field of image restoration. Traditional diffusion-based image restoration methods utilize degraded images as conditional input to effectively guide the reverse generation process, without modifying the original denoising diffusion process. However, since the degraded images already include low-frequency information, starting from Gaussian white noise will result in increased sampling steps. We propose Resfusion, a general framework that incorporates the residual term into the diffusion forward process, starting the reverse process directly from the noisy degraded images. The form of our inference process is consistent with the DDPM. We introduce a weighted residual noise, named resnoise, as the prediction target and explicitly provide the quantitative relationship between the residual term and the noise term in resnoise. By leveraging a smooth equivalence transformation, Resfusion determines the optimal acceleration step and maintains the integrity of existing noise schedules, unifying the training and inference processes. The experimental results demonstrate that Resfusion exhibits competitive performance on the ISTD, LOL, and Raindrop datasets with only five sampling steps. Furthermore, Resfusion can be easily applied to image generation and shows strong versatility. Our code and model are available at https://github.com/nkicsl/Resfusion.
What problem does this paper attempt to address?
This paper addresses an inefficiency in diffusion-based image restoration: traditional methods start the reverse process from Gaussian white noise, which increases the number of sampling steps. Because the degraded image already contains low-frequency information, starting from pure noise is unnecessary. The paper therefore proposes a new framework, Resfusion, which introduces a residual term into the forward diffusion process and starts the reverse process directly from the noisy degraded image, reducing the number of sampling steps needed for restoration.
### Main Issues
1. **Increased Sampling Steps**: Traditional diffusion models start from Gaussian white noise, adding unnecessary sampling steps.
2. **Redundant Low-Frequency Information**: The degraded image already contains low-frequency information, making it unnecessary and inefficient to generate images from white noise.
3. **Model Performance Optimization**: A more efficient method is needed to improve the quality and speed of image restoration.
### Solution
- **Introducing Residual Term**: Introduce a residual term \( R \) in the forward diffusion process, defined as \( R = \hat{x}_0 - x_0 \), where \( \hat{x}_0 \) is the degraded image and \( x_0 \) is the real image.
- **Redefining the Forward Process**: Redefine the forward process as \( q(x_t | x_{t-1}, R) \) and determine the optimal acceleration step through a smooth equivalence transformation.
- **Unified Training and Inference Process**: Predicting a weighted residual noise, named resnoise, keeps the reverse inference process consistent in form with DDPM; together with the smooth equivalence transformation, this unifies training and inference.
- **Simplified Noise Scheduling**: Existing noise schedules can be reused directly, with no need to design a new one.
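The bullets above can be made concrete with a short derivation. It is a sketch under one stated assumption, not a formula taken from this summary: the one-step kernel is assumed to be the usual DDPM step plus a shift of \( (1 - \sqrt{\alpha_t}) R \). The symbols \( \alpha_t = 1 - \beta_t \), \( \bar{\alpha}_t = \prod_{s \le t} \alpha_s \), and \( \epsilon \sim \mathcal{N}(0, I) \) follow standard DDPM notation, and \( \epsilon_{res} \) is used here as a label for resnoise.

\[
q(x_t \mid x_{t-1}, R) = \mathcal{N}\!\left(x_t;\ \sqrt{\alpha_t}\, x_{t-1} + (1 - \sqrt{\alpha_t})\, R,\ \beta_t I\right)
\]

Subtracting \( R \) from both sides of the sampling step gives \( x_t - R = \sqrt{\alpha_t}\,(x_{t-1} - R) + \sqrt{\beta_t}\,\epsilon \), so \( x_t - R \) follows an ordinary DDPM forward process started at \( x_0 - R \). The closed form is therefore

\[
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + (1 - \sqrt{\bar{\alpha}_t})\, R + \sqrt{1 - \bar{\alpha}_t}\,\epsilon .
\]

Applying the standard DDPM posterior mean to \( x_t - R \) and shifting back by \( R \) yields a mean with the same form as DDPM, with the noise replaced by a weighted combination of noise and residual:

\[
\mu_t = \frac{1}{\sqrt{\alpha_t}} \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_{res} \right),
\qquad
\epsilon_{res} = \epsilon + \frac{(1 - \sqrt{\alpha_t})\,\sqrt{1 - \bar{\alpha}_t}}{\beta_t}\, R .
\]

Under this assumed kernel, a network that predicts \( \epsilon_{res} \) can be sampled with an unmodified DDPM update, which is one way to read the claim that the inference process is consistent with DDPM.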
### Experimental Results
- **ISTD Dataset**: In the shadow removal task, Resfusion reports competitive results on PSNR, SSIM, and MAE with only five sampling steps.
- **LOL Dataset**: In the low-light enhancement task, Resfusion reports competitive results on PSNR, SSIM, and LPIPS with only five sampling steps.
- **Raindrop Dataset**: In the raindrop removal task, Resfusion reports competitive results on PSNR and SSIM with only five sampling steps.
### Contributions
1. **Introducing Residual Term**: By introducing a residual term, the reverse process starts from the noisy degraded image, reducing sampling steps.
2. **Clarifying the Relationship Between Residual and Noise Terms**: Defines a weighted residual noise, named resnoise, as the prediction target and gives the quantitative relationship between the residual term and the noise term within it.
3. **Smooth Equivalence Transformation**: Determines the optimal acceleration step through a smooth equivalence transformation, unifying the training and inference processes.
4. **Efficiency and Generality**: Resfusion is not only applicable to image restoration but can also be extended to image generation tasks, demonstrating strong generality and efficiency.
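The acceleration step in contribution 3 can be illustrated numerically. The sketch below rests on assumptions that this summary does not state: a closed-form forward process \( x_t = \sqrt{\bar{\alpha}_t}\, x_0 + (1 - \sqrt{\bar{\alpha}_t})\, R + \sqrt{1 - \bar{\alpha}_t}\,\epsilon \) and a linear DDPM-style noise schedule. All function names are hypothetical. Substituting \( R = \hat{x}_0 - x_0 \) gives an \( x_0 \) coefficient of \( 2\sqrt{\bar{\alpha}_t} - 1 \), which vanishes when \( \sqrt{\bar{\alpha}_t} = 1/2 \). At the step closest to that point, \( x_t \) depends (approximately) only on the degraded image and noise, so the reverse process can plausibly start there.

```python
import math


def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (DDPM-style defaults; an assumption of this sketch)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def sqrt_alpha_bars(betas):
    """Return sqrt(alpha_bar_t) for every step, with alpha_t = 1 - beta_t."""
    values, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        values.append(math.sqrt(product))
    return values


def acceleration_step(sqrt_ab):
    """Index of the step whose sqrt(alpha_bar_t) is closest to 1/2."""
    return min(range(len(sqrt_ab)), key=lambda t: abs(sqrt_ab[t] - 0.5))


def x0_coefficient(sqrt_ab_t):
    """Weight of the clean image x_0 in x_t after substituting R = x0_hat - x0."""
    return 2.0 * sqrt_ab_t - 1.0


def forward_mean(x0, x0_hat, sqrt_ab_t):
    """Per-pixel mean of x_t under the assumed closed form."""
    residual = x0_hat - x0
    return sqrt_ab_t * x0 + (1.0 - sqrt_ab_t) * residual


if __name__ == "__main__":
    sqrt_ab = sqrt_alpha_bars(linear_betas())
    t_acc = acceleration_step(sqrt_ab)
    print("acceleration step index:", t_acc)
    print("sqrt(alpha_bar) at that step:", sqrt_ab[t_acc])
    print("x_0 coefficient at that step:", x0_coefficient(sqrt_ab[t_acc]))
```

Because the schedule itself is left untouched and only the starting index changes, this reading is consistent with the claim that existing noise schedules can be reused. The five-step sampling reported in the abstract would additionally need a much coarser schedule, which this sketch does not model.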
In summary, the paper targets the sampling inefficiency of traditional diffusion models in image restoration by introducing a residual term and redefining the diffusion process, and it reports competitive performance on shadow removal, low-light enhancement, and raindrop removal with only five sampling steps.