Amortized Variational Inference for Deep Gaussian Processes

Qiuxian Meng, Yongyou Zhang
2024-09-19
Abstract: Gaussian processes (GPs) are Bayesian nonparametric models for function approximation with principled predictive uncertainty estimates. Deep Gaussian processes (DGPs) are multilayer generalizations of GPs that can represent complex marginal densities as well as complex mappings. As exact inference is either computationally prohibitive or analytically intractable in GPs and extensions thereof, some existing methods resort to variational inference (VI) techniques for tractable approximations. However, the expressivity of conventional approximate GP models critically relies on independent inducing variables that might not be informative enough for some problems. In this work we introduce amortized variational inference for DGPs, which learns an inference function that maps each observation to variational parameters. The resulting method enjoys a more expressive prior conditioned on fewer input-dependent inducing variables and a flexible amortized marginal posterior that is able to model more complicated functions. We show with theoretical reasoning and experimental results that our method performs similarly to or better than previous approaches at less computational cost.
Machine Learning
What problem does this paper attempt to address?
The paper addresses two limitations of existing sparse approximations for Deep Gaussian Processes (DGPs): limited expressiveness and high computational cost. Specifically:

1. **Limited expressiveness**: Conventional sparse GPs rely on independent inducing variables, which may not be informative enough for some complex problems and therefore limit what the model can represent.
2. **High computational cost**: The multi-layer structure of DGPs raises computational complexity, and the cost becomes a bottleneck on large-scale datasets.

To address these problems, the authors introduce amortized variational inference (AVI) for DGPs. The method learns an inference function that maps each observation to variational parameters. This lets the model use input-dependent inducing variables, which improves expressiveness while reducing computational cost.

### Specific objectives:

1. **Improve the expressiveness of the model**: With input-dependent inducing variables, the model can better capture the characteristics of complex functions.
2. **Reduce the computational cost**: Amortization lets the model condition on fewer, input-dependent inducing variables instead of a large set of independent ones, which lowers computational complexity.
3. **Improve posterior approximation**: Propose a new marginal posterior distribution that reduces computational cost while remaining expressive.

### Method overview:

- **Amortized Variational Inference (AVI)**: Learn a mapping from the input data to the variational parameters, so that each observed data point generates its own variational parameters.
- **Multi-layer structure**: Apply amortized variational inference in each layer of the DGP, so that the variational parameters of each layer depend on the output of the previous layer.
- **Optimization strategy**: Propose three amortization strategies (AR1, AR2, and AR2P), combined with Monte Carlo sampling, to efficiently approximate the Evidence Lower Bound (ELBO).

### Main contributions:

- **More flexible prior**: Conditioning on input-dependent inducing variables makes the prior more flexible, so it can represent more complex functions.
- **Efficient posterior approximation**: The proposed amortized marginal posterior reduces computational cost while remaining expressive.
- **Theoretical and experimental verification**: Theoretical reasoning and experimental results indicate that the proposed method performs similarly to or better than existing methods at lower computational cost.

In conclusion, this paper aims to improve the expressiveness and computational efficiency of Deep Gaussian Processes on complex functions and large-scale datasets by introducing amortized variational inference.
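
To make the ideas under "Method overview" more concrete, the following is a minimal NumPy sketch of amortization in a stacked sparse GP. It is an illustration under stated assumptions, not the authors' implementation, and it does not reproduce the AR1, AR2, or AR2P strategies. The one-hidden-layer inference network, the RBF kernel, the diagonal variational covariance, the width-1 layers, and every function name (`init_amortizer`, `amortize`, `layer_marginal`, `mc_expected_loglik`) are hypothetical choices made for this example. The marginal used in each layer is the standard sparse-GP form, mean `K_xz K_zz^{-1} m` and variance `k_xx - K_xz K_zz^{-1} (K_zz - S) K_zz^{-1} K_zx`, with `Z`, `m`, and `S` produced per input by the network.

```python
import numpy as np


def rbf(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def init_amortizer(rng, d_in, hidden, num_inducing):
    """Hypothetical inference network: one tanh hidden layer.

    Its output packs inducing inputs Z (M * d_in values), a variational
    mean m (M values), and log standard deviations (M values).
    """
    d_out = num_inducing * d_in + 2 * num_inducing
    return {
        "W1": rng.normal(0.0, 0.5, (d_in, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, (hidden, d_out)), "b2": np.zeros(d_out),
    }


def amortize(params, x, num_inducing):
    """Map ONE input x of shape (d_in,) to its own (Z, m, s)."""
    d_in = x.shape[0]
    h = np.tanh(x @ params["W1"] + params["b1"])
    out = h @ params["W2"] + params["b2"]
    split = num_inducing * d_in
    Z = out[:split].reshape(num_inducing, d_in)
    m = out[split:split + num_inducing]
    s = np.exp(out[-num_inducing:])  # positive diagonal std of q(u)
    return Z, m, s


def layer_marginal(params, x, num_inducing, jitter=1e-6):
    """Mean and variance of q(f(x)) for one layer, with input-dependent Z, m, S."""
    Z, m, s = amortize(params, x, num_inducing)
    Kzz = rbf(Z, Z) + jitter * np.eye(num_inducing)
    Kxz = rbf(x[None, :], Z)                      # shape (1, M)
    A = np.linalg.solve(Kzz, Kxz.T).T             # K_xz K_zz^{-1}, shape (1, M)
    mean = (A @ m)[0]
    kxx = rbf(x[None, :], x[None, :])[0, 0]
    var = kxx - (A @ (Kzz - np.diag(s**2)) @ A.T)[0, 0]
    return mean, max(var, 1e-12)                  # guard against round-off


def sample_dgp_output(layers, x, num_inducing, rng):
    """Propagate one sample through the layers (each layer has a scalar output)."""
    h = x
    for params in layers:
        mean, var = layer_marginal(params, h, num_inducing)
        h = np.array([mean + np.sqrt(var) * rng.standard_normal()])
    return h[0]


def mc_expected_loglik(layers, x, y, num_inducing, rng, num_samples=8, noise_var=0.1):
    """Monte Carlo estimate of the data-fit term E_q[log N(y | f(x), noise_var)].

    The KL terms of the ELBO are deliberately omitted from this sketch.
    """
    vals = []
    for _ in range(num_samples):
        f = sample_dgp_output(layers, x, num_inducing, rng)
        vals.append(-0.5 * np.log(2.0 * np.pi * noise_var) - 0.5 * (y - f) ** 2 / noise_var)
    return float(np.mean(vals))


rng = np.random.default_rng(0)
M = 3  # inducing variables per input; kept small on purpose
layers = [init_amortizer(rng, 2, 8, M), init_amortizer(rng, 1, 8, M)]
x = np.array([0.3, -1.2])
print(mc_expected_loglik(layers, x, y=0.5, num_inducing=M, rng=rng))
```

Because the second layer's network receives the sampled output of the first layer, its variational parameters depend on the previous layer's output, which mirrors the multi-layer bullet above. Training would additionally require the KL terms and gradients with respect to the network weights, for which an automatic-differentiation library would be the natural choice.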