Longitudinal Deep Kernel Gaussian Process Regression

Junjie Liang, Yanting Wu, Dongkuan Xu, Vasant Honavar
DOI: https://doi.org/10.48550/arXiv.2005.11770
2020-12-08
Abstract: Gaussian processes offer an attractive framework for predictive modeling from longitudinal data, i.e., irregularly sampled, sparse observations from a set of individuals over time. However, such methods have two key shortcomings: (i) they rely on ad hoc heuristics or expensive trial and error to choose effective kernels, and (ii) they fail to handle multilevel correlation structure in the data. We introduce Longitudinal deep kernel Gaussian process regression (L-DKGPR), which, to the best of our knowledge, is the only method to overcome these limitations by fully automating the discovery of complex multilevel correlation structure from longitudinal data. Specifically, L-DKGPR eliminates the need for ad hoc heuristics or trial and error using a novel adaptation of deep kernel learning that combines the expressive power of deep neural networks with the flexibility of non-parametric kernel methods. L-DKGPR effectively learns the multilevel correlation with a novel additive kernel that simultaneously accommodates both time-varying and time-invariant effects. We derive an efficient algorithm to train L-DKGPR using latent space inducing points and variational inference. Results of extensive experiments on several benchmark data sets demonstrate that L-DKGPR significantly outperforms the state-of-the-art longitudinal data analysis (LDA) methods.
Machine Learning
What problem does this paper attempt to address?
This paper addresses two key challenges in predictive modeling from longitudinal data:

1. **Difficulty of kernel selection**: Existing methods rely on heuristics or expensive trial-and-error to select effective kernels. This is time-consuming and can lead to sub-optimal choices that degrade model performance.
2. **Handling multilevel correlation structure**: Existing methods have difficulty handling the multilevel correlation (MC) structure in the data, that is, the complex interactions between time-varying and time-invariant effects. This structure is usually complex and unknown a priori, and failing to account for it biases statistical inference.

To overcome these problems, the authors introduce Longitudinal Deep Kernel Gaussian Process Regression (L-DKGPR). L-DKGPR addresses them in the following ways:

- **Automatic discovery of complex correlation structure**: L-DKGPR uses deep kernel learning to combine the expressive power of deep neural networks with the flexibility of non-parametric kernel methods, so that the multilevel correlation structure in longitudinal data is learned rather than specified by hand.
- **New additive kernel**: L-DKGPR introduces an additive kernel that accounts for both time-varying and time-invariant effects and can therefore model multilevel correlation structure.
- **Efficient training algorithm**: The authors propose a training algorithm based on latent space inducing points and variational inference, which allows the model to run efficiently on large-scale data sets as well.

According to the reported experiments on several benchmark data sets, L-DKGPR significantly outperforms existing longitudinal data analysis methods.
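The idea of an additive kernel with a time-varying and a time-invariant component can be illustrated with a small NumPy sketch. It is not the authors' implementation: the one-layer `embed` function standing in for a deep encoder, the per-individual `id_embedding` table, the choice of RBF kernels for both components, and the random toy data are all assumptions made for illustration. The sketch also uses exact GP regression rather than the inducing-point variational training described in the paper.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def embed(x, weights):
    """Hypothetical stand-in for a deep encoder: a single tanh layer."""
    return np.tanh(x @ weights)

def additive_kernel(x1, ids1, x2, ids2, weights, id_embedding):
    """Sum of a time-varying kernel and a time-invariant kernel.

    The time-varying part compares encoded observations; the
    time-invariant part compares the individuals they belong to.
    """
    k_varying = rbf(embed(x1, weights), embed(x2, weights))
    k_invariant = rbf(id_embedding[ids1], id_embedding[ids2])
    return k_varying + k_invariant

# Toy longitudinal data: 12 observations spread over 4 individuals.
rng = np.random.default_rng(0)
n_individuals, n_obs, n_features, latent_dim = 4, 12, 3, 2
x = rng.normal(size=(n_obs, n_features))          # time-varying covariates
ids = rng.integers(0, n_individuals, size=n_obs)  # individual of each row
y = rng.normal(size=n_obs)                        # outcomes

# Fixed random parameters; in deep kernel learning these would be trained.
weights = rng.normal(size=(n_features, latent_dim))
id_embedding = rng.normal(size=(n_individuals, latent_dim))

K = additive_kernel(x, ids, x, ids, weights, id_embedding)

# Exact GP posterior mean at the training inputs (assumed noise variance).
noise_variance = 0.1
alpha = np.linalg.solve(K + noise_variance * np.eye(n_obs), y)
posterior_mean = K @ alpha
```

Because each component is a valid kernel, their sum is also a valid kernel. Two observations from the same individual always receive the maximal time-invariant contribution, while the time-varying term depends only on the encoded covariates.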