Abstract: In the Big Data era, with the ubiquity of geolocation sensors in particular, massive datasets exhibiting a possibly complex spatial dependence structure are becoming increasingly available. In this context, the standard probabilistic theory of statistical learning does not apply directly, and guarantees of the generalization capacity of predictive rules learned from such data remain to be established. We analyze here the simple Kriging task from a statistical learning perspective, i.e. by carrying out a nonparametric finite-sample predictive analysis. Given $d\geq 1$ values taken by a realization of a square integrable random field $X=\{X_s\}_{s\in S}$, $S\subset \mathbb{R}^2$, with unknown covariance structure, at sites $s_1,\; \ldots,\; s_d$ in $S$, the goal is to predict the unknown values it takes at any other location $s\in S$ with minimum quadratic risk. The prediction rule is derived from a training spatial dataset: a single realization $X'$ of $X$, independent from those to be predicted, observed at $n\geq 1$ locations $\sigma_1,\; \ldots,\; \sigma_n$ in $S$. Despite the connection of this minimization problem with kernel ridge regression, establishing the generalization capacity of empirical risk minimizers is far from straightforward, because the training data $X'_{\sigma_1},\; \ldots,\; X'_{\sigma_n}$ involved in the learning procedure are not independent and identically distributed. In this article, non-asymptotic bounds of order $O_{\mathbb{P}}(1/\sqrt{n})$ are proved for the excess risk of a plug-in predictive rule mimicking the true minimizer in the case of isotropic stationary Gaussian processes, observed at locations forming a regular grid in the learning stage. These theoretical results are illustrated by various numerical experiments, on simulated data and on real-world datasets.

What problem does this paper attempt to address?

This paper asks how statistical learning theory can handle prediction tasks on large-scale datasets with complex spatial dependence structures, focusing on the Simple Kriging task. The standard probabilistic theory of statistical learning does not apply directly to such data, because they usually violate the independent and identically distributed (i.i.d.) assumption. The paper therefore analyzes Simple Kriging from a statistical learning perspective and establishes non-asymptotic bounds on its generalization ability. Specifically, it considers a square-integrable random field $X = \{X_s\}_{s\in S}$, where $S\subset\mathbb{R}^2$ and the covariance structure of $X$ is unknown. The values of $X$ are observed at $d\geq 1$ sites $s_1,\ldots,s_d\in S$, and the goal is to predict the value of $X$ at any other location $s\in S$ with minimum quadratic risk. The training dataset is a single realization $X'$ of $X$, independent of the values to be predicted and observed at $n\geq 1$ locations $\sigma_1,\ldots,\sigma_n\in S$. Although this problem is related to kernel ridge regression (KRR), the training data $X'_{\sigma_1},\ldots,X'_{\sigma_n}$ are not i.i.d., so establishing the generalization ability of empirical risk minimizers is far from straightforward. The main contribution of the paper is to prove that, for isotropic stationary Gaussian processes observed at locations forming a regular grid in the learning stage, the excess risk of a plug-in prediction rule admits non-asymptotic bounds of order $O_{\mathbb{P}}(1/\sqrt{n})$.
These theoretical results are illustrated by numerical experiments on simulated data and on real-world datasets, which lay the foundation for the development of statistical learning methods based on spatial data.
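To make the setup above concrete, here is a minimal Python sketch of a Simple Kriging predictor, not the paper's implementation. It assumes a known mean (zero by default) and an isotropic stationary covariance given as a function of distance. The function names `simple_kriging_predict` and `empirical_covariance`, the exponential covariance in the usage example, and the particular lag-averaging estimator are all illustrative assumptions. The summary does not specify which covariance estimator the paper uses. With covariance matrix $\Sigma = (c(\|s_i - s_j\|))_{i,j}$ and vector $c_d(s) = (c(\|s - s_i\|))_i$, the predictor is $c_d(s)^\top \Sigma^{-1} (X_{s_1},\ldots,X_{s_d})^\top$. A plug-in rule replaces $c$ with an estimate built from the training realization.

```python
import numpy as np


def simple_kriging_predict(cov, sites, values, target, mean=0.0):
    """Simple Kriging prediction at `target` from `values` observed at `sites`.

    cov:    callable mapping an array of distances h >= 0 to covariances c(h)
            (isotropic stationary assumption).
    sites:  array of shape (d, 2), the observation sites s_1, ..., s_d.
    values: array of shape (d,), the observed values at those sites.
    target: array of shape (2,), the location s to predict at.
    mean:   known mean of the field (Simple Kriging assumes it is known).
    """
    sites = np.asarray(sites, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    # Covariance matrix between sites: sigma[i, j] = c(|s_i - s_j|).
    pairwise = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
    sigma = cov(pairwise)
    # Covariance vector between the target and each site.
    c_vec = cov(np.linalg.norm(sites - target, axis=-1))
    # Kriging weights solve sigma @ w = c_vec.
    weights = np.linalg.solve(sigma, c_vec)
    return float(mean + weights @ (values - mean))


def empirical_covariance(grid_values, lag):
    """Illustrative estimate of c(lag) from one zero-mean realization observed
    on a regular grid with unit spacing, averaging products over horizontal
    and vertical pairs at that lag (requires 0 <= lag < min(grid shape))."""
    z = np.asarray(grid_values, dtype=float)
    if lag == 0:
        return float(np.mean(z * z))
    horiz = z[:, :-lag] * z[:, lag:]
    vert = z[:-lag, :] * z[lag:, :]
    return float((horiz.sum() + vert.sum()) / (horiz.size + vert.size))


if __name__ == "__main__":
    # Hypothetical exponential covariance c(h) = exp(-h), used only for the demo.
    exp_cov = lambda h: np.exp(-h)
    sites = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    values = [1.0, 2.0, 3.0]
    print(simple_kriging_predict(exp_cov, sites, values, [0.5, 0.5]))
```

Without a nugget term the predictor interpolates the data: at an observed site it returns the observed value. Far from all sites it reverts to the mean. Turning lag-wise estimates from `empirical_covariance` into a valid covariance function, for instance by fitting a parametric model, is left out of this sketch.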

A Statistical Learning View of Simple Kriging

Kryging: Geostatistical analysis of large-scale datasets using Krylov subspace methods

A Divide-and-Conquer Bayesian Approach to Large-Scale Kriging

Reduced Basis Kriging for Big Spatial Fields

Explaining and Connecting Kriging with Gaussian Process Regression

On Prediction Properties of Kriging: Uniform Error Bounds and Robustness

Sparse Pseudo-input Local Kriging for Large Spatial Datasets with Exogenous Variables

Scalable Stochastic Kriging with Markovian Covariances

The Spatial Kernel Predictor based on Huge Observation Sets

A geostatistical model based on random walks to krige regions with irregular boundaries and holes

Properties and comparison of some Kriging sub-model aggregation methods

Exploring the Efficacy of Statistical and Deep Learning Methods for Large Spatial Datasets: A Case Study

An adaptive kriging method for solving nonlinear inverse statistical problems

Krigings over space and time based on latent low-dimensional structures

A type-II maximum-likelihood approach to Gaussian scale mixture-based sparse regression Kriging

Sample and Computationally Efficient Stochastic Kriging in High Dimensions

Resolution Adaptive Fixed Rank Kriging

Stochastic Local Interaction Model: Geostatistics without Kriging

Normalizing Basis Functions: Approximate Stationary Models for Large Spatial Data

Series ridge regression for spatial data on $\mathbb{R}^d$