Abstract: Approximate inference in Gaussian process (GP) models with non-conjugate likelihoods gets entangled with the learning of the model hyperparameters. We improve hyperparameter learning in GP models and focus on the interplay between variational inference (VI) and the learning target. While VI's lower bound to the marginal likelihood is a suitable objective for inferring the approximate posterior, we show that a direct approximation of the marginal likelihood as in Expectation Propagation (EP) is a better learning objective for hyperparameter optimization. We design a hybrid training procedure to bring the best of both worlds: it leverages conjugate-computation VI for inference and uses an EP-like marginal likelihood approximation for hyperparameter learning. We compare VI, EP, Laplace approximation, and our proposed training procedure and empirically demonstrate the effectiveness of our proposal across a wide range of data sets.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

This paper attempts to address the issues encountered in hyperparameter learning under non-conjugate likelihoods in Gaussian Process (GP) models. Specifically, the paper focuses on the interaction between Variational Inference (VI) and hyperparameter learning and proposes an improved method.

#### Background Issues

1. **Approximate Inference under Non-Conjugate Likelihoods**:
   - In the case of non-conjugate likelihoods, exact inference is not feasible, thus requiring the use of approximate inference methods.
   - Common approximate inference methods include Laplace Approximation (LA), Expectation Propagation (EP), and Variational Inference (VI).
2. **Hyperparameter Learning**:
   - Learning hyperparameters is crucial for the model's generalization ability on unseen data.
   - Existing methods typically jointly optimize variational parameters and hyperparameters, but this approach can lead to bias as the training objective is only representative of the variational parameters.

#### Main Contributions of the Paper

1. **Improved Hyperparameter Learning Method**:
   - The paper proposes a hybrid training procedure that combines Natural-Gradient VI and EP-style marginal likelihood estimation.
   - This method ensures a good representation of the posterior in the E-step using the variational objective and performs hyperparameter learning in the M-step using the EP-style objective, thereby improving generalization ability without increasing computational cost.
2. **Experimental Validation**:
   - The paper demonstrates the effectiveness of the proposed hybrid training method on various datasets in binary classification tasks and student-t regression tasks.
   - Experimental results show that EP-style marginal likelihood estimation is closer to the MCMC baseline than VI, providing a better learning objective.

### Summary

The paper aims to address the challenges of hyperparameter learning in Gaussian Process models under non-conjugate likelihoods. By designing a hybrid training procedure that combines Natural-Gradient Variational Inference and EP-style marginal likelihood estimation, the model's generalization ability is improved. Experimental results validate the effectiveness and practicality of the proposed method.
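
As a rough illustration of the E-step/M-step split described in this summary, the alternation can be written schematically as below. The first relation is the standard variational lower bound on the log marginal likelihood. The symbols $q$, $\theta$, $\mathcal{L}_{\mathrm{VI}}$, and $\log Z_{\mathrm{EP}}$ are notation assumed here for illustration, not taken from the paper, and the exact form of the EP-style estimate is not given in this summary. In practice each $\arg\max$ may be replaced by one or more gradient steps.

```latex
% Variational lower bound (standard identity), assuming a GP prior p(f | \theta)
% and a factorizing likelihood p(y | f) = \prod_i p(y_i | f_i):
\log p(y \mid \theta)
  \;\ge\;
  \mathcal{L}_{\mathrm{VI}}(q, \theta)
  = \sum_{i=1}^{n} \mathbb{E}_{q(f_i)}\!\left[\log p(y_i \mid f_i)\right]
    - \mathrm{KL}\!\left[\, q(f) \,\|\, p(f \mid \theta) \,\right]

% Schematic alternation (assumed notation, not the paper's):
% E-step: update the approximate posterior with the hyperparameters held fixed
q^{(t+1)} = \arg\max_{q} \; \mathcal{L}_{\mathrm{VI}}\big(q, \theta^{(t)}\big)

% M-step: update the hyperparameters on an EP-style estimate of \log p(y | \theta),
% evaluated at the posterior approximation from the E-step
\theta^{(t+1)} = \arg\max_{\theta} \; \log Z_{\mathrm{EP}}\big(\theta;\, q^{(t+1)}\big)
```

The point of the split is that the bound $\mathcal{L}_{\mathrm{VI}}$ is used only to fit $q$, while $\theta$ is learned against a direct approximation of the marginal likelihood rather than against the bound itself.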

Improving Hyperparameter Learning under Approximate Inference in Gaussian Process Models

Pseudo Independent Conditional Approximation for Training the Mixtures of Gaussian Processes

Amortized Variational Inference for Deep Gaussian Processes

Variational Inference for Uncertainty on the Inputs of Gaussian Process Models

On the Approximation Accuracy of Gaussian Variational Inference

Learning inducing points and uncertainty on molecular data by scalable variational Gaussian processes

Scalable Training of Inference Networks for Gaussian-Process Models

Variable Sigma Gaussian Processes: an Expectation Propagation Perspective

Iterative Construction of Gaussian Process Surrogate Models for Bayesian Inference

Convergence of Gaussian Process Regression with Estimated Hyper-Parameters and Applications in Bayesian Inverse Problems

Composite Gaussian Processes: Scalable Computation and Performance Analysis

Iterative Methods for Vecchia-Laplace Approximations for Latent Gaussian Process Models

Likelihood approximations via Gaussian approximate inference

Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains

Recommendations for Baselines and Benchmarking Approximate Gaussian Processes

Hybrid kernel approach to improving the numerical stability of machine learning for parametric equations with Gaussian processes in the noisy and noise-free data assumptions

Variational Linearized Laplace Approximation for Bayesian Deep Learning

Bayesian leave-one-out cross-validation approximations for Gaussian latent variable models

Scalable Bayesian Optimization Using Vecchia Approximations of Gaussian Processes

Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the Predictive Uncertainties

Operator Learning with Gaussian Processes