Spatiotemporal modeling of European paleoclimate using doubly sparse Gaussian processes

Seth D. Axen, Alexandra Gessner, Christian Sommer, Nils Weitzel, Álvaro Tejero-Cantero
DOI: https://doi.org/10.48550/arXiv.2211.08160
2022-11-15
Abstract: Paleoclimatology -- the study of past climate -- is relevant beyond climate science itself, such as in archaeology and anthropology for understanding past human dispersal. Information about the Earth's paleoclimate comes from simulations of physical and biogeochemical processes and from proxy records found in naturally occurring archives. Climate-field reconstructions (CFRs) combine these data into a statistical spatial or spatiotemporal model. To date, there exists no consensus spatiotemporal paleoclimate model that is continuous in space and time, produces predictions with uncertainty, and can include data from various sources. A Gaussian process (GP) model would have these desired properties; however, GPs scale unfavorably with data of the magnitude typical for building CFRs. We propose to build on recent advances in sparse spatiotemporal GPs that reduce the computational burden by combining variational methods based on inducing variables with the state-space formulation of GPs. We successfully employ such a doubly sparse GP to construct a probabilistic model of European paleoclimate from the Last Glacial Maximum (LGM) to the mid-Holocene (MH) that synthesizes paleoclimate simulations and fossilized pollen proxy data.
Machine Learning, Applications
What problem does this paper attempt to address?
The main problem this paper addresses is how to construct a paleoclimate model that is continuous in space and time, quantifies its uncertainty, and synthesizes paleoclimate simulations with fossil pollen proxy data. Specifically, the research targets two issues:

1. **Lack of a Consensus Model**: There is currently no global, spatiotemporally continuous, probabilistic paleoclimate model that combines most of the available proxy and simulation data and can be queried at any location and time.
2. **Computational Complexity**: Traditional Gaussian process (GP) models scale poorly with dataset size, which makes them difficult to apply to realistic paleoclimate reconstructions.

To solve these problems, the authors propose a method based on doubly sparse Gaussian processes, which reduces the computational burden by combining inducing variables with a Markovian state-space representation. With this method, they model the annual mean temperature in Europe from the Last Glacial Maximum (LGM; approximately 21,000 years ago) to the mid-Holocene (MH; approximately 6,000 years ago).

### Main Contributions

- **Application of Sparse Gaussian Processes**: Combining sparse variational Gaussian processes with Markov structure significantly reduces the computational cost, making the approach suitable for large-scale paleoclimate datasets.
- **Integration of Multiple Data Sources**: Paleoclimate simulations and fossil pollen proxy data are combined into a single spatiotemporally continuous probabilistic model.
- **Uncertainty Estimation**: The model provides not only predicted values but also calibrated uncertainty estimates, which makes the results more reliable.

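
To make the inducing-variable idea concrete, here is a minimal NumPy sketch of sparse GP regression on a one-dimensional toy problem. It is not the authors' implementation: it covers only the inducing-point half of the "doubly sparse" construction (no state-space component), uses the closed-form optimal variational posterior available for a Gaussian likelihood, and fixes the kernel hyperparameters instead of learning them. The function names `rbf` and `sparse_gp_predict` and the synthetic sine data are assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def sparse_gp_predict(x, y, z, x_star, noise_var=0.01, jitter=1e-6):
    """Predict at x_star from N observations using M inducing inputs z.

    Only M x M systems are solved, so the cost is O(N M^2) rather than
    the O(N^3) of exact GP regression.
    """
    kuu = rbf(z, z) + jitter * np.eye(len(z))   # M x M
    kuf = rbf(z, x)                             # M x N
    ksu = rbf(x_star, z)                        # S x M
    a = kuu + kuf @ kuf.T / noise_var           # M x M
    mean = ksu @ np.linalg.solve(a, kuf @ y) / noise_var
    # Latent variance: prior - Nystrom term + posterior uncertainty about u.
    prior_var = np.ones(len(x_star))            # k(x*, x*) = 1 for this kernel
    nystrom = np.sum(ksu * np.linalg.solve(kuu, ksu.T).T, axis=1)
    correction = np.sum(ksu * np.linalg.solve(a, ksu.T).T, axis=1)
    var = np.maximum(prior_var - nystrom + correction, 0.0)  # guard round-off
    return mean, var

# Toy data (assumed for illustration): noisy samples of a sine curve.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=2000)
y = np.sin(x) + 0.1 * rng.standard_normal(2000)
z = np.linspace(0.0, 10.0, 12)        # 12 inducing inputs summarize 2000 points
x_star = np.linspace(1.0, 9.0, 50)

mean, var = sparse_gp_predict(x, y, z, x_star)
print("max abs error vs. sin:", np.max(np.abs(mean - np.sin(x_star))))
```

The design point is that the 2000 observations enter only through the products `kuf @ kuf.T` and `kuf @ y`, so the number of inducing inputs, not the number of observations, sets the size of the linear systems.
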
### Model Validation

The authors validated the model with a leave-one-time-slice-out scheme and report its prediction errors and confidence intervals for different time periods. The results show that the model performs well in both prediction accuracy and uncertainty estimation.

### Future Prospects

The authors plan to improve the model further by:

- Using measurement data from the early 1900s for spatial interpolation instead of relying on weighted interpolation of simulation data.
- Introducing non-zero parametric prior mean functions to capture the remaining large-scale spatiotemporal trends.
- Using independent likelihood functions for different data sources (such as proxy data and simulation data).
- Accounting for the dating uncertainty of proxy data (such as pollen).

In summary, this research lays the foundation for constructing a global paleoclimate consensus model and provides a framework that can be applied in fields such as archaeology and paleoecology.
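
The leave-one-time-slice-out validation mentioned under Model Validation can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: it uses a small exact GP with a fixed separable kernel in place of the doubly sparse model, and synthetic (location, time) data in place of simulations and pollen records. The names `kernel`, `gp_predict`, and `leave_one_time_slice_out` are hypothetical.

```python
import numpy as np

def kernel(a, b, ls_space=1.0, ls_time=2.0):
    """Separable RBF kernel over inputs of shape (n, 2): (location, time)."""
    ds = (a[:, None, 0] - b[None, :, 0]) / ls_space
    dt = (a[:, None, 1] - b[None, :, 1]) / ls_time
    return np.exp(-0.5 * (ds**2 + dt**2))

def gp_predict(x_train, y_train, x_test, noise_var=0.01):
    """Exact GP regression; returns predictive mean and variance of noisy y."""
    k = kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    ks = kernel(x_test, x_train)
    mean = ks @ np.linalg.solve(k, y_train)
    var = 1.0 - np.sum(ks * np.linalg.solve(k, ks.T).T, axis=1) + noise_var
    return mean, np.maximum(var, 1e-12)

def leave_one_time_slice_out(x, y, predict=gp_predict):
    """Hold out each time slice in turn; report RMSE and 95% coverage."""
    results = {}
    for t in np.unique(x[:, 1]):
        held_out = x[:, 1] == t
        mean, var = predict(x[~held_out], y[~held_out], x[held_out])
        err = y[held_out] - mean
        rmse = np.sqrt(np.mean(err**2))
        # Fraction of held-out values inside the central 95% interval.
        coverage = np.mean(np.abs(err) <= 1.96 * np.sqrt(var))
        results[float(t)] = (rmse, coverage)
    return results

# Synthetic data (assumed): 8 time slices with 30 random locations each.
rng = np.random.default_rng(1)
times = np.repeat(np.arange(8.0), 30)
locations = rng.uniform(0.0, 5.0, size=times.size)
x = np.column_stack([locations, times])
y = np.sin(locations) * np.cos(0.3 * times) + 0.1 * rng.standard_normal(times.size)

results = leave_one_time_slice_out(x, y)
for t, (rmse, coverage) in results.items():
    print(f"slice t={t:.0f}: RMSE={rmse:.3f}, 95% coverage={coverage:.2f}")
```

If the predictive uncertainty is well calibrated, the coverage values should sit near 0.95 on average. Passing a different `predict` callable would let the same loop evaluate another model.
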