Gradient Boosting: A Computationally Efficient Alternative to Markov Chain Monte Carlo Sampling for Fitting Large Bayesian Spatio-Temporal Binomial Regression Models

Rongjie Huang, Christopher McMahan, Brian Herrin, Alexander McLain, Bo Cai, Stella Self
DOI: https://doi.org/10.1016/j.idm.2024.09.008
2024-10-06
Infectious Disease Modelling
Abstract: Disease forecasting and surveillance often involve fitting models to a tremendous volume of historical testing data collected over space and time. Bayesian spatio-temporal regression models fit with Markov chain Monte Carlo (MCMC) methods are commonly used for such data. When the spatio-temporal support of the model is large, implementing an MCMC algorithm becomes a significant computational burden. This research proposes a computationally efficient gradient boosting algorithm for fitting a Bayesian spatio-temporal mixed effects binomial regression model. We demonstrate our method on a disease forecasting model and compare it to a computationally optimized MCMC approach. Both methods are used to produce monthly forecasts for Lyme disease, anaplasmosis, ehrlichiosis, and heartworm disease in domestic dogs for the contiguous United States. The data have a spatial support of 3,108 counties and a temporal support of 108 to 138 months with 71 to 135 million test results. The proposed estimation approach is several orders of magnitude faster than the optimized MCMC algorithm, with a similar mean absolute prediction error.
infectious diseases, mathematical & computational biology