Mixed-Stationary Gaussian Process for Flexible Non-Stationary Modeling of Spatial Outcomes

Leo L. Duan, Xia Wang, Rhonda D. Szczesniak
DOI: https://doi.org/10.48550/arXiv.1807.06656
2018-07-18
Abstract: Gaussian processes (GPs) are commonplace in spatial statistics. Although many non-stationary models have been developed, there is arguably a lack of flexibility compared to equipping each location with its own parameters. However, the latter suffers from intractable computation and can lead to overfitting. Taking the instantaneous stationarity idea, we construct a non-stationary GP with the stationarity parameter individually set at each location. Then we utilize the non-parametric mixture model to reduce the effective number of unique parameters. Different from a simple mixture of independent GPs, the mixture in stationarity allows the components to be spatially correlated, leading to improved prediction efficiency. Theoretical properties are examined and a linearly scalable algorithm is provided. The application is shown through several simulated scenarios as well as the massive spatiotemporally correlated temperature data.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the limited flexibility of existing non-stationary Gaussian processes (NS-GPs) for spatial data. Existing models do produce non-stationarity, but they restrict how stationarity can change from one region to another. This matters for spatial prediction: overestimating spatial autocorrelation over-smooths the surface and understates prediction uncertainty, while underestimating it inflates variance and makes the interpolator inefficient.

To address this, the paper proposes the Mixed-Stationary Gaussian Process (MSGP) model. The model builds a non-stationary Gaussian process by giving each location its own stationarity parameter, then uses a non-parametric mixture model to reduce the effective number of unique parameters. Unlike a simple mixture of independent Gaussian processes, the mixture in stationarity keeps the components spatially correlated, which improves prediction efficiency.

Specifically, the paper tackles two key problems:

1. **Parameter Estimation Problem**: Instantaneous or piecewise stationary models usually require a large number of parameters, which leads to high variance in parameter estimation. This is hard to avoid unless the number of stationary regions is artificially limited. The paper instead uses non-parametric Bayesian methods (such as mixture models) so that different regions can share the same parameter values even when they are not spatially contiguous. Borrowing information in this way substantially reduces the number of parameters required.
2. **Prediction Problem**: A simple piecewise structure has no correlation between any two regions, so interpolation can use only part of the data, which is a serious limitation. The boundaries can be smoothed by model averaging or marginalization, but the paper points out that this is less efficient for prediction than including the Gaussian process covariance directly. The proposed non-stationary model can be regarded as a mixture model that depends on Gaussian processes, with a dense Gaussian correlation matrix conditional on the component assignment. This makes it possible to cluster spatial data while maintaining prediction efficiency.

Together, these give a more flexible and efficient non-stationary Gaussian process modeling approach for large-scale, spatiotemporally correlated data.
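The contrast between a mixture in stationarity and a mixture of independent GPs can be illustrated with a small sketch. This is not the paper's construction or algorithm. It assumes a standard one-dimensional non-stationary squared-exponential kernel (often called the Gibbs kernel) as a stand-in, fixes the cluster labels by hand rather than inferring them with a non-parametric mixture, and uses hypothetical names (`gibbs_kernel`, `independent_blocks`, `gp_predict`, `ell_values`) chosen only for illustration. It shows that when each location draws its length-scale from a small shared set of values, the covariance matrix stays dense across clusters, whereas the independent-mixture baseline zeroes out every cross-cluster entry.

```python
import numpy as np


def gibbs_kernel(x, ell):
    """Non-stationary squared-exponential (Gibbs) kernel in 1-D.

    x   : (n,) locations
    ell : (n,) length-scale attached to each location
    """
    x = np.asarray(x, dtype=float)
    ell = np.asarray(ell, dtype=float)
    sq_sum = ell[:, None] ** 2 + ell[None, :] ** 2
    prefactor = np.sqrt(2.0 * ell[:, None] * ell[None, :] / sq_sum)
    sq_dist = (x[:, None] - x[None, :]) ** 2
    return prefactor * np.exp(-sq_dist / sq_sum)


def independent_blocks(K, labels):
    """Baseline: keep within-cluster covariance, set cross-cluster entries to 0."""
    same = labels[:, None] == labels[None, :]
    return np.where(same, K, 0.0)


def gp_predict(K, train, test, y_train, noise=1e-2):
    """Posterior mean at `test` given noisy observations at `train`."""
    K_tt = K[np.ix_(train, train)] + noise * np.eye(len(train))
    K_st = K[np.ix_(test, train)]
    return K_st @ np.linalg.solve(K_tt, y_train)


# 20 locations on a line; cluster labels alternate in runs of five, so each
# cluster covers regions that are not contiguous (labels assumed known here).
n = 20
x = 0.5 * np.arange(n)
labels = (np.arange(n) // 5) % 2

# Only two unique length-scales are shared by all 20 locations.
ell_values = np.array([0.5, 2.0])
ell = ell_values[labels]

K_mixed = gibbs_kernel(x, ell)                 # dense: clusters stay correlated
K_block = independent_blocks(K_mixed, labels)  # independent-GP mixture baseline

# Predict a held-out point that sits on a cluster boundary.
y = np.sin(x)
test = np.array([5])
train = np.setdiff1d(np.arange(n), test)
pred_mixed = gp_predict(K_mixed, train, test, y[train])
pred_block = gp_predict(K_block, train, test, y[train])

print("cross-cluster covariance (mixed):", K_mixed[4, 5])
print("cross-cluster covariance (block):", K_block[4, 5])
print("prediction at boundary, mixed vs block:", pred_mixed[0], pred_block[0])
```

In the block baseline the held-out point can draw only on observations from its own cluster, while the dense version also uses its immediate neighbor across the boundary. This is the sense in which a piecewise structure interpolates with only part of the data.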