Abstract: The Multi-Output Gaussian Process (MOGP) is a popular tool for modelling data from multiple sources. A typical choice to build a covariance function for a MOGP is the Linear Model of Coregionalization (LMC), which parametrically models the covariance between outputs. The Latent Variable MOGP (LV-MOGP) generalises this idea by modelling the covariance between outputs using a kernel applied to latent variables, one per output, leading to a flexible MOGP model that allows efficient generalization to new outputs with few data points. Computational complexity in LV-MOGP grows linearly with the number of outputs, which makes it unsuitable for problems with a large number of outputs. In this paper, we propose a stochastic variational inference approach for the LV-MOGP that allows mini-batches for both inputs and outputs, making computational complexity per training iteration independent of the number of outputs.
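To make the contrast in the abstract concrete, here is a minimal NumPy sketch of the two covariance constructions. It is an illustration under stated assumptions, not the paper's implementation: the RBF kernel, the rank-1 coregionalization matrix `W @ W.T`, the Kronecker layout, and all names (`rbf`, `lmc_cov`, `lvmogp_cov`, `X`, `H`, `W`) are hypothetical choices made for this example.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def lmc_cov(X, W):
    """LMC-style covariance (assumed single latent function, rank-1 B = W W^T).

    The between-output covariance B is a free parameter matrix.
    """
    B = W @ W.T
    return np.kron(B, rbf(X, X))

def lvmogp_cov(X, H):
    """LV-MOGP-style covariance: the between-output covariance comes from
    a kernel evaluated on one latent vector per output (rows of H)."""
    return np.kron(rbf(H, H), rbf(X, X))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 1))   # 4 inputs, 1-dimensional (toy data)
H = rng.normal(size=(3, 2))   # 3 outputs, each with a 2-dimensional latent vector
W = rng.normal(size=(3, 1))   # LMC mixing weights for 3 outputs

K_lmc = lmc_cov(X, W)         # shape (3*4, 3*4)
K_lv = lvmogp_cov(X, H)       # shape (3*4, 3*4)
```

In this layout, block `(i, j)` of `K_lv` is the input kernel matrix scaled by the latent-space kernel between outputs `i` and `j`. A new output then only needs a new row in `H`, not a new row and column of free parameters, which is one way to read the abstract's claim about generalizing to new outputs with few data points.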

What problem does this paper attempt to address?

The main problem this paper attempts to address is the computational complexity of Multi-Output Gaussian Processes (MOGP) when handling a large number of outputs. Specifically, the existing LV-MOGP models have a computational complexity that grows linearly with the number of outputs, making them impractical for large-scale problems. To overcome this limitation, the paper proposes a method based on Stochastic Variational Inference (SVI), which allows for mini-batch training of both inputs and outputs, thereby making the computational complexity of each training iteration independent of the number of outputs.

### Main Contributions:

1. **Doubly Stochastic Training Objective**: The paper proposes a new training objective that supports mini-batch training of both inputs and outputs, making the computational complexity of each training iteration independent of the number of outputs.
2. **Multiple Latent Variables Assumption**: The paper relaxes the LV-MOGP assumption that each output uses only one latent variable by introducing multiple latent variables per output, allowing the model to construct more flexible covariance matrices.
3. **Support for Non-Gaussian Likelihoods**: The proposed method is not only applicable to Gaussian likelihoods but can also be extended to non-Gaussian likelihoods, such as Poisson likelihoods, making the model applicable to a wider range of datasets.
4. **Validation in Real Applications**: The paper validates the proposed method on multiple real datasets, including spatiotemporal climate modeling and spatial transcriptomics.

### Problems Addressed:

- **Computational Complexity**: Traditional LV-MOGP models have high computational complexity when handling a large number of outputs, limiting their application in large-scale problems. The paper reduces the per-iteration computational complexity by introducing stochastic variational inference with mini-batches over both inputs and outputs.
- **Flexibility and Applicability**: By introducing multiple latent variables and support for non-Gaussian likelihoods, the proposed model demonstrates higher flexibility and better performance when handling different types of datasets.

### Experimental Results:

- **Exchange Rate Prediction**: On a small-scale dataset with 13 outputs, the GS-LVMOGP model performs comparably to existing methods, with performance improvements observed when increasing the number of latent variables.
- **NYC Crime Data Modeling**: On a large-scale dataset with 447 outputs, the GS-LVMOGP model performs better with Poisson likelihood than with Gaussian likelihood, and prediction accuracy further improves with an increase in the number of latent variables.
- **Spatiotemporal Temperature Modeling**: On a spatiotemporal temperature dataset with 1260 outputs, the GS-LVMOGP model excels in both interpolation and extrapolation tasks, particularly in accurately predicting temperatures at unseen locations during extrapolation tasks.
- **Climate Forecasting**: On the USHCN climate dataset, the GS-LVMOGP model outperforms other methods in predicting subsequent observations.

In summary, this paper addresses the computational complexity of multi-output Gaussian processes with a large number of outputs by introducing doubly stochastic variational inference, and it increases model flexibility through multiple latent variables. The proposed method is validated on multiple real applications.
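The mini-batching idea described above can be illustrated with a small NumPy sketch. It assumes the training objective contains a sum of per-pair terms over all (output, input) combinations, and it shows that a rescaled sum over a mini-batch of outputs and a mini-batch of inputs is an unbiased estimate of that full sum. This is a generic illustration, not the paper's exact objective; `terms`, `minibatch_estimate`, and the toy sizes are hypothetical.

```python
import itertools
import numpy as np

def minibatch_estimate(terms, out_idx, in_idx):
    """Estimate terms.sum() from a mini-batch of outputs and a mini-batch of inputs.

    terms[d, n] stands in for a per-pair objective term (e.g. an expected
    log-likelihood); the cost of this function depends only on the batch sizes.
    """
    D, N = terms.shape
    scale = (D / len(out_idx)) * (N / len(in_idx))
    return scale * terms[np.ix_(out_idx, in_idx)].sum()

rng = np.random.default_rng(0)
terms = rng.normal(size=(4, 5))   # toy problem: 4 outputs, 5 inputs
full_sum = terms.sum()

# Enumerate every possible mini-batch (2 outputs x 2 inputs) to check
# unbiasedness exactly rather than by random sampling.
estimates = [
    minibatch_estimate(terms, list(d), list(n))
    for d in itertools.combinations(range(4), 2)
    for n in itertools.combinations(range(5), 2)
]
mean_estimate = float(np.mean(estimates))
```

Averaged over all equally likely mini-batches, `mean_estimate` matches `full_sum` up to floating-point error, while each individual estimate touches only `2 * 2` of the `4 * 5` terms. In practice a single random mini-batch per iteration would be used, trading a noisier gradient for a per-iteration cost that does not grow with the number of outputs.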

Scalable Multi-Output Gaussian Processes with Stochastic Variational Inference
