A comparison of statistical and machine learning models for spatio-temporal prediction of ambient air pollutant concentrations in Scotland

Qiangqiang Zhu, Duncan Lee, Oliver Stoner
DOI: https://doi.org/10.1007/s10651-024-00635-5
2024-11-16
Environmental and Ecological Statistics
Abstract: The spatio-temporal prediction of air pollutant concentrations is vital for assessing regulatory compliance and for producing exposure estimates in epidemiological studies. Numerous approaches have been utilised for making such predictions, including land use regression models, additive models, spatio-temporal smoothing models and machine learning prediction algorithms. However, relatively few studies have compared the predictive performance of these models thoroughly, which is one of the novel contributions of this paper. For the specific challenge of predicting monthly average concentrations of NO2, PM10 and PM2.5 in Scotland, we find that random forests typically outperform (or are as good as) more traditional statistical prediction approaches. Additionally, we utilise the best performing model to provide a new data resource, namely, predictions of monthly average concentrations (with uncertainty quantification) of the above pollutants on a regular 1 km grid for all of Scotland between 2016 and 2020.
environmental sciences; statistics & probability; mathematics, interdisciplinary applications