Efficient Prior Calibration From Indirect Data

O. Deniz Akyildiz, Mark Girolami, Andrew M. Stuart, Arnaud Vadeboncoeur
2024-05-28
Abstract: Bayesian inversion is central to the quantification of uncertainty within problems arising from numerous applications in science and engineering. To formulate the approach, four ingredients are required: a forward model mapping the unknown parameter to an element of a solution space, often the solution space for a differential equation; an observation operator mapping an element of the solution space to the data space; a noise model describing how noise pollutes the observations; and a prior model describing knowledge about the unknown parameter before the data is acquired. This paper is concerned with learning the prior model from data; in particular, learning the prior from multiple realizations of indirect data obtained through the noisy observation process. The prior is represented, using a generative model, as the pushforward of a Gaussian in a latent space; the pushforward map is learned by minimizing an appropriate loss function. A metric that is well-defined under empirical approximation is used to define the loss function for the pushforward map to make an implementable methodology. Furthermore, an efficient residual-based neural operator approximation of the forward model is proposed and it is shown that this may be learned concurrently with the pushforward map, using a bilevel optimization formulation of the problem; this use of neural operator approximation has the potential to make prior learning from indirect data more computationally efficient, especially when the observation process is expensive, non-smooth or not known. The ideas are illustrated with the Darcy flow inverse problem of finding permeability from piezometric head measurements.
Machine Learning, Computation
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper addresses the problem of learning a prior model from indirect data. Specifically, it focuses on learning the prior from multiple realizations of indirect data obtained through a noisy observation process. A prior model describes what is known about the unknown parameter before the data is acquired. In many scientific and engineering applications the unknown parameter cannot be observed directly and can only be inferred from indirect data.

### Background and methods

1. **Bayesian inversion**:
   - Bayesian inversion is an important tool for quantifying uncertainty and is widely used in science and engineering.
   - It requires four ingredients: a forward model, an observation operator, a noise model, and a prior model.
   - The forward model maps the unknown parameter to an element of a solution space, often the solution space of a differential equation.
   - The observation operator maps an element of the solution space to the data space.
   - The noise model describes how noise contaminates the observations.
   - The prior model describes knowledge about the unknown parameter before the data is acquired.
2. **Learning prior models**:
   - The paper proposes a method for learning the prior model from indirect data.
   - The prior is represented by a generative model, namely the pushforward of a Gaussian distribution in a latent space.
   - The pushforward map is learned by minimizing an appropriate loss function.
   - A residual-based neural operator approximates the forward model and can be learned concurrently with the pushforward map through a bilevel optimization formulation. This is particularly useful when the observation process is expensive, non-smooth, or unknown.

### Main contributions

1. **Selecting an appropriate divergence**:
   - A divergence \(d_1\) based on the sliced Wasserstein-2 distance is introduced, and its computational feasibility is demonstrated.
2. **Residual-based probabilistic loss function**:
   - A residual-based probabilistic loss function is introduced to define the parameter \(\phi^*(\alpha)\) of the neural operator approximation.
3. **Computationally feasible objective function**:
   - With \(\phi^*(\alpha)\) defined in this way, the objective function \(J_3(\alpha)\) is shown to be computationally feasible.
4. **Connection with Bayes' theorem**:
   - When \(N = 1\), minimizing the objective function \(J_1\) can be related to Bayes' theorem.
5. **Specific applications**:
   - The method is worked out for the Darcy flow model of porous media flow, and its feasibility and consistency are verified through numerical experiments.

### Mathematical expressions

- **Observation model**:
  \[ y^{(n)}=G(z^{(n)})+\epsilon^{(n)} \]
  where the noise terms \(\epsilon^{(n)}\sim\eta\) are independent and identically distributed.
- **Forward model**: the map \(G\) factors as \(G=g\circ F^{\dagger}\), so that
  \[ y^{(n)}=g\circ F^{\dagger}(z^{(n)})+\epsilon^{(n)} \]
  where \(F^{\dagger}:Z\rightarrow U\) maps the parameter space to the solution space, and the observation operator \(g:U\rightarrow\mathbb{R}^{d_y}\) maps the solution space to the data space.
- **Objective functions**: in \(J_2\) the prior \(\mu\) is parameterized as the pushforward \((T_\alpha)_{\#}\mu_0\) of a Gaussian \(\mu_0\) under the map \(T_\alpha\), and in \(J_3\) the forward model \(F^{\dagger}\) is replaced by the neural operator \(F_{\phi^*(\alpha)}\):
  \[ J_1(\mu)=d_1\left(\nu,\eta*(g\circ F^{\dagger})_{\#}\mu\right)+H(\mu) \]
  \[ J_2(\alpha)=d_1\left(\nu,\eta*(g\circ F^{\dagger}\circ T_\alpha)_{\#}\mu_0\right)+h(\alpha) \]
  \[ J_3(\alpha)=d_1\left(\nu,\eta*(g\circ F_{\phi^*(\alpha)}\circ\right)
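
The sliced Wasserstein-2 distance named under the main contributions can be estimated from samples alone, which is why a loss of this kind is usable with empirical data. The following Python sketch illustrates the idea under stated assumptions; it is not the paper's implementation, and the paper's divergence \(d_1\) is only described here as being based on this distance. The function names, the affine map standing in for \(T_\alpha\), the toy forward map \(z\mapsto(z,z^2)\), and the Gaussian noise are all hypothetical choices made for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def sliced_w2_squared(xs, ys, n_projections=100, seed=0):
    """Monte Carlo estimate of the squared sliced Wasserstein-2 distance
    between two equal-size empirical samples (lists of vectors)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        theta = random_unit_vector(dim, rng)
        # Project both samples onto the direction theta and sort them;
        # in 1-D the optimal coupling matches sorted points in order.
        px = sorted(sum(t * c for t, c in zip(theta, x)) for x in xs)
        py = sorted(sum(t * c for t, c in zip(theta, y)) for y in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return total / n_projections


def sample_toy_data(alpha, n, noise_std, rng):
    """Draw n samples from a toy version of eta * (g o F o T_alpha)_# mu_0.

    Assumptions (illustrative only): mu_0 is a 1-D standard Gaussian,
    T_alpha(x) = alpha[0] + alpha[1] * x, F(z) = (z, z**2), g is the
    identity, and eta is isotropic Gaussian noise.
    """
    samples = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)          # latent draw from mu_0
        z = alpha[0] + alpha[1] * x      # pushforward through T_alpha
        u = (z, z * z)                   # toy forward map F
        samples.append([c + rng.gauss(0.0, noise_std) for c in u])
    return samples


if __name__ == "__main__":
    # "Observed" data generated with alpha = (1.0, 0.5).
    data = sample_toy_data((1.0, 0.5), 300, 0.1, random.Random(1))
    for alpha in [(1.0, 0.5), (3.0, 0.5)]:
        model = sample_toy_data(alpha, 300, 0.1, random.Random(2))
        print(alpha, sliced_w2_squared(data, model))
```

In this toy setting the estimate should be much smaller for the data-generating value of `alpha` than for the mismatched one, which is the property a loss of the form \(J_2(\alpha)\) relies on. The pure-Python loops are chosen for readability; a practical version would vectorize the projections and sorting.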