A bivariate spatial extreme mixture model for unreplicated heavy metal soil contamination

M. Daniela Cuba, Marian Scott, Benjamin P. Marchant, Daniela Castro-Camilo
2024-02-22
Abstract: Geostatistical models for multivariate applications such as heavy metal soil contamination work under Gaussian assumptions and may result in underestimated extreme values and misleading risk assessments (Marchant et al., 2011). A more suitable framework to analyse extreme values is extreme value theory (EVT). However, EVT relies on replications in time, which are generally not available in geochemical datasets. Therefore, using EVT to map soil contamination requires adaptation to be used in the usual single-replicate data framework of soil surveys. We propose a bivariate spatial extreme mixture model to model the body and tail of contaminant pairs, where the tails are described using a stationary generalised Pareto distribution. We demonstrate the performance of our model using a simulation study and through modelling bivariate soil contamination in the Glasgow conurbation. Model results are given as maps of predicted marginal concentrations and probabilities of joint exceedance of soil guideline values. Marginal concentration maps show areas of elevated lead levels along the Clyde River and elevated levels of chromium around the south and southeast villages such as East Kilbride and Wishaw. The joint probability maps show higher probabilities of joint exceedance to the south and southeast of the city centre, following known legacy contamination regions in the Clyde River basin.
Applications
What problem does this paper attempt to address?
This paper addresses how to model and analyze extreme heavy metal concentrations in soil when the data come from a single spatial survey with no replication in time. Conventional geostatistical models rely on Gaussian assumptions, which can underestimate extreme values and so produce misleading risk assessments. Extreme value theory (EVT) is better suited to extremes, but it usually relies on replications in time, so the paper adapts it to the single-replicate setting typical of soil surveys.

The proposed bivariate spatial extreme mixture model combines a non-extreme and an extreme distribution to describe the body and tail of a pair of contaminants jointly. The body represents natural processes and background pollution, while the tail represents extreme concentrations arising from anthropogenic or natural processes and is described by a generalized Pareto distribution. Spatial dependence is handled through a co-regionalization framework. Inference uses the integrated nested Laplace approximation (INLA), an efficient method for latent Gaussian models.

The paper evaluates the model in a simulation study and in a case study of the Glasgow conurbation. Results are presented as maps of predicted marginal concentrations and of probabilities of joint exceedance of soil guideline values, which highlight areas of elevated lead and chromium. Such maps help characterize the extent and risk of heavy metal contamination in urban and densely populated areas and can inform public health prevention measures.
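The body-and-tail idea can be illustrated for a single contaminant with a short Python sketch. It is not the authors' model: it ignores the spatial and bivariate structure and the INLA inference entirely. The Gaussian body, the threshold `u`, the tail probability `p_u`, and every numeric value are assumptions chosen only for illustration. Only the generalized Pareto form of the tail comes from the text above.

```python
import math


def gpd_survival(y, sigma, xi):
    """P(Y > y) for a generalized Pareto excess Y >= 0 (scale sigma > 0, shape xi)."""
    if y <= 0:
        return 1.0
    if abs(xi) < 1e-12:
        # The shape -> 0 limit is the exponential distribution.
        return math.exp(-y / sigma)
    z = 1.0 + xi * y / sigma
    if z <= 0:
        # For xi < 0 the distribution has a finite upper endpoint.
        return 0.0
    return z ** (-1.0 / xi)


def normal_cdf(x, mu, sd):
    """CDF of a Gaussian, used here as an assumed body distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))


def mixture_exceedance(x, u, p_u, mu, sd, sigma, xi):
    """P(X > x) under a simple body-tail mixture.

    Below the threshold u the body is a Gaussian truncated at u and
    rescaled to carry probability 1 - p_u. Above u the excess X - u
    follows a generalized Pareto distribution carrying probability p_u.
    """
    if x > u:
        return p_u * gpd_survival(x - u, sigma, xi)
    body_cdf = (1.0 - p_u) * normal_cdf(x, mu, sd) / normal_cdf(u, mu, sd)
    return 1.0 - body_cdf


# Hypothetical parameters on an arbitrary (for example log-concentration) scale.
params = dict(u=5.0, p_u=0.05, mu=3.5, sd=0.8, sigma=0.6, xi=0.2)

# Probability of exceeding a hypothetical guideline value of 6.0.
p_exceed = mixture_exceedance(6.0, **params)
```

The two pieces meet at the threshold, where the exceedance probability equals `p_u` from either side, so the mixture defines a valid distribution. In the paper's setting, quantities of this kind would vary over space and be computed jointly for a pair of contaminants, which this sketch does not attempt.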