Inverse Problems with Diffusion Models: A MAP Estimation Perspective

Sai Bharath Chandra Gutha, Ricardo Vinuesa, Hossein Azizpour
2024-09-18
Abstract: Inverse problems have many applications in science and engineering. In computer vision, several image restoration tasks such as inpainting, deblurring, and super-resolution can be formally modeled as inverse problems. Recently, methods have been developed for solving inverse problems that only leverage a pre-trained unconditional diffusion model and do not require additional task-specific training. In such methods, however, the inherent intractability of determining the conditional score function during the reverse diffusion process poses a real challenge, leaving the methods to settle for an approximation instead, which affects their performance in practice. Here, we propose a MAP estimation framework to model the reverse conditional generation process of a continuous-time diffusion model as an optimization process of the underlying MAP objective, whose gradient term is tractable. In theory, the proposed framework can be applied to solve general inverse problems using gradient-based optimization methods. However, given the highly non-convex nature of the loss objective, finding a perfect gradient-based optimization algorithm can be quite challenging. Nevertheless, our framework offers several potential research directions. We use our proposed formulation to develop empirically effective algorithms for image restoration. We validate our proposed algorithms with extensive experiments over multiple datasets across several restoration tasks.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
### The problems the paper attempts to solve

The paper addresses **inverse problems**, especially image restoration tasks in computer vision. Specifically, it focuses on how to use **pre-trained unconditional diffusion models** to solve inverse problems without additional task-specific training. In existing methods of this kind, the conditional score function needed during the reverse diffusion process is intractable, so they rely on an approximation, which degrades their practical performance. The paper proposes a maximum a posteriori (MAP) estimation framework that models the reverse conditional generation process of a continuous-time diffusion model as an optimization process of the underlying MAP objective, whose gradient term is tractable. In theory, the framework can be combined with gradient-based optimization methods to solve general inverse problems. However, because the loss objective is highly non-convex, finding a perfect gradient-based optimization algorithm remains challenging. Nevertheless, the framework opens several potential research directions.

### Specific background

Inverse problems are common in science and engineering. In computer vision, many image restoration tasks (such as inpainting, deblurring, and super-resolution) can be formulated as inverse problems. An inverse problem is usually described by the following equation:

\[ y = A(x)+\eta \tag{1} \]

where \( y\in\mathbb{R}^m \) is the observation of the original data \( x\in\mathbb{R}^n \), \( A \) is the measurement operator, and \( \eta \) is independent and identically distributed noise, usually assumed to be Gaussian, i.e., \( \eta\sim\mathcal{N}(0,\sigma_y^2I) \). The task is to infer the original data \( x \) from the observation \( y \).
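
As a rough illustration of the measurement model above, the following sketch simulates \( y = A(x)+\eta \) for an inpainting-style operator and evaluates the Gaussian log-likelihood that the noise assumption implies. The helper names (`make_inpainting_operator`, `measure`, `log_likelihood`), the mask, the vector size, and the noise level are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def make_inpainting_operator(mask):
    """Return a hypothetical A(x) that keeps only the entries where mask is True."""
    mask = np.asarray(mask, dtype=bool)

    def A(x):
        return np.asarray(x)[mask]

    return A

def measure(A, x, sigma_y, rng):
    """Simulate y = A(x) + eta with eta ~ N(0, sigma_y^2 I)."""
    clean = A(x)
    return clean + sigma_y * rng.standard_normal(clean.shape)

def log_likelihood(A, x, y, sigma_y):
    """log N(y; A(x), sigma_y^2 I), including the normalising constant."""
    residual = y - A(x)
    m = residual.size
    return (-0.5 * np.sum(residual**2) / sigma_y**2
            - 0.5 * m * np.log(2 * np.pi * sigma_y**2))

rng = np.random.default_rng(0)
x = rng.standard_normal(8)                                # toy "image", n = 8 (assumed)
mask = np.array([1, 0, 1, 1, 0, 1, 0, 1], dtype=bool)     # m = 5 observed entries (assumed)
A = make_inpainting_operator(mask)
y = measure(A, x, sigma_y=0.05, rng=rng)

# The true x should explain y far better than an all-zero guess.
print(log_likelihood(A, x, y, 0.05), log_likelihood(A, np.zeros(8), y, 0.05))
```

The same `measure` and `log_likelihood` functions would work for any callable `A`, for example a blur or a downsampling operator, since they only assume additive Gaussian noise.
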
For linear inverse problems, \( A \) is a linear mapping and can be represented by a matrix \( H\in\mathbb{R}^{m\times n} \).

### Limitations of existing methods

Traditional approaches to inverse problems include methods based on functional analysis, probability theory, and data-driven techniques. In recent years, deep-learning-based methods have achieved remarkable success. In the Bayesian framework, solving an inverse problem naturally corresponds to estimating the posterior \( P(x|y) \). Deep-learning-based methods typically fall into two categories:

1. **Directly learning the posterior \( P(x|y) \)**: achieved through conditional generative models.
2. **Learning the prior \( P(x) \)**: achieved through unconditional generative models, which are then used to infer \( P(x|y) \).

The first category requires task-specific training, so a trained model does not transfer to other tasks. The second category trains an unconditional generative model to learn \( P(x) \); this training is task-independent and only requires a dataset of original data samples \( x \). Because the likelihood \( P(y|x) \) is tractable (from equation (1), \( P(y|x)=\mathcal{N}(A(x),\sigma_y^2I) \)), these methods combine it with the learned prior through Bayes' rule to infer the posterior, \( P(x|y)\propto P(y|x)P(x) \).

### Diffusion models and their applications

Diffusion models are a recent family of generative models that generate data by simulating a stochastic process \( \{x(t)\}_{t = 0}^T \) described by a stochastic differential equation (SDE). The forward process starts from a clean data sample and gradually adds noise until it becomes a noisy sample from the prior distribution \( P_T \).
The reverse process then converts a noisy sample from \( P_T \) into a clean sample from the data distribution. It is described by the corresponding reverse SDE:

\[ dx=\left[ f(x,t)-g(t)^2\nabla_x\log P_t(x)\right]dt + g(t)\,d\bar{w} \]

where \( f(x,t) \) and \( g(t) \) are the drift and diffusion coefficients of the forward SDE, \( \bar{w} \) is a Wiener process running backward in time, and \( \nabla_x\log P_t(x) \) is the score function of the marginal distribution \( P_t(x) \). If the score function of every marginal distribution is known, the reverse SDE can be simulated to generate samples.
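
To make the role of the score function concrete, here is a minimal sketch that integrates the reverse SDE with an Euler-Maruyama scheme on a one-dimensional toy problem where the score is known in closed form. Everything specific is an assumption made for illustration: the data distribution \( \mathcal{N}(\mu_0, s_0^2) \), the forward SDE with \( f = 0 \) and \( g = 1 \) (so that \( P_t = \mathcal{N}(\mu_0, s_0^2 + t) \)), the horizon `T`, and the step count. A real diffusion model would replace the analytic `score` with a trained network.

```python
import numpy as np

MU0, S0 = 2.0, 0.5   # toy data distribution N(MU0, S0^2) (assumed)
T = 4.0              # final diffusion time (assumed)

def f(x, t):
    """Drift of the assumed forward SDE (zero in this toy setup)."""
    return np.zeros_like(x)

def g(t):
    """Diffusion coefficient of the assumed forward SDE."""
    return 1.0

def score(x, t):
    """Exact score of P_t = N(MU0, S0^2 + t) for this toy forward SDE."""
    return -(x - MU0) / (S0**2 + t)

def reverse_sde_sample(n_samples, n_steps, rng):
    """Euler-Maruyama integration of the reverse SDE from t = T down to t = 0."""
    dt = T / n_steps
    # Start from x(T) ~ P_T, which is known exactly in this toy case.
    x = MU0 + np.sqrt(S0**2 + T) * rng.standard_normal(n_samples)
    for i in range(n_steps):
        t = T - i * dt
        drift = f(x, t) - g(t) ** 2 * score(x, t)
        # Stepping backward in time: subtract drift * dt, then add fresh noise.
        x = x - drift * dt + g(t) * np.sqrt(dt) * rng.standard_normal(n_samples)
    return x

rng = np.random.default_rng(0)
samples = reverse_sde_sample(n_samples=20000, n_steps=1000, rng=rng)
print(samples.mean(), samples.std())
```

With a small enough step size, the sample mean and standard deviation should land close to `MU0` and `S0`, which is the sense in which knowing every marginal score lets the reverse process recover the data distribution.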