Hierarchical Integration Diffusion Model for Realistic Image Deblurring

Zheng Chen, Yulun Zhang, Ding Liu, Bin Xia, Jinjin Gu, Linghe Kong, Xin Yuan
DOI: https://doi.org/10.48550/arXiv.2305.12966
2023-05-22
Computer Vision and Pattern Recognition
Abstract: Diffusion models (DMs) have recently been introduced in image deblurring and exhibited promising performance, particularly in terms of details reconstruction. However, the diffusion model requires a large number of inference iterations to recover the clean image from pure Gaussian noise, which consumes massive computational resources. Moreover, the distribution synthesized by the diffusion model is often misaligned with the target results, leading to restrictions in distortion-based metrics. To address the above issues, we propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring. Specifically, we perform the DM in a highly compacted latent space to generate the prior feature for the deblurring process. The deblurring process is implemented by a regression-based method to obtain better distortion accuracy. Meanwhile, the highly compact latent space ensures the efficiency of the DM. Furthermore, we design the hierarchical integration module to fuse the prior into the regression-based model from multiple scales, enabling better generalization in complex blurry scenarios. Comprehensive experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods. Code and trained models are available at https://github.com/zhengchen1999/HI-Diff.
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper targets three problems in image deblurring:

1. **Computational efficiency**: Diffusion models (DMs) perform well in image deblurring, especially in detail reconstruction. However, a conventional DM needs a large number of inference iterations to recover a clean image from pure Gaussian noise, which consumes substantial computational resources.
2. **Distribution alignment**: The distribution synthesized by a DM is often misaligned with the target result, which limits performance on distortion-based metrics such as PSNR.
3. **Generalization to complex blur**: Real-world blur is complex and non-uniform, and it is difficult to model with specific priors, so traditional methods perform poorly in such cases.

To address these problems, the authors propose the Hierarchical Integration Diffusion Model (HI-Diff), which combines three ideas:

- **Compact latent space**: The DM runs in a highly compact latent space and generates a prior feature for the deblurring process. Working in this small space keeps the DM efficient.
- **Regression-based deblurring**: The deblurring itself is performed by a regression-based model, which gives better distortion accuracy than sampling the image directly from the DM.
- **Hierarchical integration module**: A hierarchical integration module fuses the prior feature into the regression-based model at multiple scales, which improves generalization in complex blurry scenarios.

According to the authors' experiments, HI-Diff outperforms state-of-the-art methods on both synthetic and real-world blur datasets.

### Formula summary

1. **Cross-attention formula**:

   \[ Q = W_Q X_r, \quad K = W_K z_i, \quad V = W_V z_i \]

   \[ \text{Attention}(Q, K, V) = \text{SoftMax}\left(\frac{QK^T}{\sqrt{\hat{C}}}\right) \cdot V \]

   where \(W_Q \in \mathbb{R}^{\hat{C} \times \hat{C}}\), \(W_K \in \mathbb{R}^{C' \times \hat{C}}\) and \(W_V \in \mathbb{R}^{C' \times \hat{C}}\) are learnable parameters for linear projection. The queries are computed from \(X_r\), while the keys and values are computed from \(z_i\).

2. **Diffusion process formula**:

   \[ q(z_{1:T} \mid z_0) = \prod_{t=1}^{T} q(z_t \mid z_{t-1}), \quad q(z_t \mid z_{t-1}) = \mathcal{N}\left(z_t; \sqrt{1-\beta_t}\, z_{t-1}, \beta_t I\right) \]

   \[ q(z_t \mid z_0) = \mathcal{N}\left(z_t; \sqrt{\bar{\alpha}_t}\, z_0, (1-\bar{\alpha}_t) I\right), \quad \alpha_t = 1-\beta_t, \quad \bar{\alpha}_t = \prod_{i=1}^{t} \alpha_i \]

3. **Reverse process formula**:

   \[ q(z_{t-1} \mid z_t, z_0) = \mathcal{N}\left(z_{t-1}; \mu_t(z_t, z_0), \frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\beta_t I\right) \]
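
The formulas above can be checked numerically with a small sketch. The following standard-library Python is only an illustration under stated assumptions, not the authors' implementation (their code is in the repository linked in the abstract). The linear \(\beta_t\) schedule, the value of \(T\), and all function names are hypothetical. The summary does not give the shapes of \(X_r\) and \(z_i\), so the sketch assumes \(X_r\) holds \(n\) tokens of width \(\hat{C}\) and \(z_i\) holds \(m\) tokens of width \(C'\), and applies each projection by right-multiplication so that the stated shapes of \(W_Q\), \(W_K\) and \(W_V\) line up.

```python
import math
import random


def linear_beta_schedule(T, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear schedule beta_1..beta_T (not specified in the text)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def alpha_bars(betas):
    """Cumulative products: alpha_bar_t = prod_{i<=t} (1 - beta_i)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(z0, t, abar, rng):
    """Draw z_t ~ N(sqrt(abar_t) z_0, (1 - abar_t) I); t is 1-indexed."""
    a = abar[t - 1]
    return [math.sqrt(a) * z + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for z in z0]


def posterior_variance(t, betas, abar):
    """Variance of q(z_{t-1} | z_t, z_0): (1 - abar_{t-1}) / (1 - abar_t) * beta_t."""
    abar_prev = abar[t - 2] if t > 1 else 1.0  # abar_0 = 1 (empty product)
    return (1.0 - abar_prev) / (1.0 - abar[t - 1]) * betas[t - 1]


def matmul(a, b):
    """Plain row-by-column matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def cross_attention(x_r, z_i, w_q, w_k, w_v):
    """SoftMax(Q K^T / sqrt(C_hat)) V with Q from x_r and K, V from z_i.

    Assumed shapes: x_r (n x C_hat), z_i (m x C'), w_q (C_hat x C_hat),
    w_k and w_v (C' x C_hat). Returns the (n x C_hat) output and the
    (n x m) attention weights.
    """
    q, k, v = matmul(x_r, w_q), matmul(z_i, w_k), matmul(z_i, w_v)
    c_hat = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(c_hat) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Usage: diffuse a toy 4-dimensional latent, then attend from 3 tokens to it.
rng = random.Random(0)
T = 8  # arbitrary small number of steps, chosen only for illustration
betas = linear_beta_schedule(T)
abar = alpha_bars(betas)
z0 = [rng.gauss(0.0, 1.0) for _ in range(4)]
z_T = q_sample(z0, T, abar, rng)
print("posterior variances:", [posterior_variance(t, betas, abar) for t in range(1, T + 1)])

x_r = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(3)]  # n=3, C_hat=2
z_i = [z_T[:2], z_T[2:]]                                           # m=2, C'=2
w = [[1.0, 0.0], [0.0, 1.0]]                                       # identity projections
out, attn = cross_attention(x_r, z_i, w, w, w)
print("attention weights:", attn)
```

Two properties follow directly from the formulas and are easy to confirm with this sketch: the posterior variance is zero at \(t = 1\) (because \(\bar{\alpha}_0 = 1\)) and smaller than \(\beta_t\) for every later step, and each row of attention weights sums to one.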