Abstract: Agent-based models (ABM) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high dimensional calibration can be computationally prohibitive. This paper presents a random forest based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June in 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results and their predictive performance is analyzed showing improved performance with a reduction in computation.

What problem does this paper attempt to address?

The paper addresses how to calibrate agent-based models (ABMs) efficiently and accurately, particularly epidemiological models that are stochastic and have many parameters. It focuses on three aspects:

1. **Computational efficiency**: Calibrating an ABM such as CityCOVID requires many model evaluations, and each one is costly because the model is high-dimensional and stochastic. Direct calibration is therefore time-consuming and expensive.
2. **Accuracy of parameter calibration**: Good predictive performance requires accurately calibrated parameters, but the stochasticity and complexity of ABMs make this hard to achieve by applying traditional methods directly to the model.
3. **Handling randomness**: Stochasticity makes calibration harder. A key challenge is obtaining accurate calibration results while still accounting for the model's randomness.

To solve these problems, the paper proposes a global surrogate model based on random forests, combined with Bayesian calibration via Markov chain Monte Carlo (MCMC), to accelerate the evaluation and calibration of ABMs. The steps are as follows:

- **Dimensionality reduction**: Apply a temporal decomposition to the quantities of interest (the numbers of hospitalizations and deaths) using principal component analysis (PCA), so that each output time series is represented by a small number of coefficients.
- **Sensitivity analysis**: Use sensitivity measures associated with the random forest (such as Gini importance, permutation importance, and Sobol indices) to identify the parameters with the greatest impact on the output, thereby reducing the dimension of the parameter space.
- **Surrogate model construction**: Train a random forest regression model that maps the reduced parameter set to the PCA coefficients of the output, giving a surrogate that is cheap to evaluate.
- **Bayesian calibration**: Use MCMC to estimate the posterior distribution of the ABM parameters, evaluating the surrogate in place of the ABM.

With this method, the paper reports a reduced computational burden and improved predictive performance when matching hospitalization and death data from the COVID-19 epidemic in Chicago from March to June 2020, compared with the previous approximate Bayesian calibration (IMABC) results.

### Formula summary

1. **PCA decomposition**:

\[
\begin{bmatrix}
h_{1,1} & \cdots & h_{1,n} & d_{1,1} & \cdots & d_{1,n} \\
h_{2,1} & \cdots & h_{2,n} & d_{2,1} & \cdots & d_{2,n} \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
h_{m,1} & \cdots & h_{m,n} & d_{m,1} & \cdots & d_{m,n}
\end{bmatrix}
\rightarrow
\begin{bmatrix} \alpha_{1,1} \\ \alpha_{2,1} \\ \vdots \\ \alpha_{m,1} \end{bmatrix} \odot \vec{c}_1
+ \cdots +
\begin{bmatrix} \alpha_{1,2n} \\ \alpha_{2,2n} \\ \vdots \\ \alpha_{m,2n} \end{bmatrix} \odot \vec{c}_{2n}
\]

where \(h_{i,j}\) and \(d_{i,j}\) are the numbers of hospitalizations and deaths, respectively, for the \(i\)-th parameter set at the \(j\)-th time step, \(\vec{c}_j\in\mathbb{R}^{2n}\) is the \(j\)-th PCA component, and \(\alpha_{i,j}\) is the coefficient of the \(i\)-th parameter set on the component \(\vec{c}_j\).

2. **Bayesian likelihood function**:

\[ L(\vec{h}^\circ, \vec{d}^\circ | \vec{\theta})=\frac{1}{(2\pi)^{n/2}\sigma_h^n} \cdots \]
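The workflow summarized above (PCA decomposition, random forest surrogate, MCMC calibration) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: `toy_abm` is a hypothetical two-parameter stochastic simulator standing in for CityCOVID, the prior bounds, noise levels `SIGMA_H` and `SIGMA_D`, and the independent Gaussian error model are invented, and the sampler is a plain random-walk Metropolis scheme rather than whatever MCMC variant the paper uses. The sensitivity-analysis step is omitted because the toy model has only two parameters.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
N_DAYS = 30
LOW = np.array([0.05, 0.5])   # hypothetical prior lower bounds (rate, severity)
HIGH = np.array([0.30, 2.0])  # hypothetical prior upper bounds

def toy_abm(theta):
    """Toy stochastic simulator (NOT CityCOVID): n hospitalizations, then n deaths."""
    rate, severity = theta
    t = np.arange(N_DAYS)
    hosp = 50.0 * severity * (1.0 - np.exp(-rate * t))
    deaths = 0.2 * hosp
    return np.concatenate([hosp, deaths]) + rng.normal(0.0, 1.0, 2 * N_DAYS)

# 1. Run the simulator at m sampled parameter sets -> m x 2n output matrix.
thetas = rng.uniform(LOW, HIGH, size=(400, 2))
outputs = np.array([toy_abm(th) for th in thetas])

# 2. Temporal decomposition: keep a few PCA coefficients (alphas) per run.
pca = PCA(n_components=3).fit(outputs)
alphas = pca.transform(outputs)

# 3. Random forest surrogate: parameters -> PCA coefficients.
forest = RandomForestRegressor(n_estimators=30, random_state=0).fit(thetas, alphas)

def surrogate(theta):
    """Predict the PCA coefficients, then map them back to the 2n-long curves."""
    alpha = forest.predict(np.asarray(theta).reshape(1, -1))
    return pca.inverse_transform(alpha)[0]

# 4. Synthetic "observations" and an assumed Gaussian log-likelihood
#    (normalizing constants dropped; uniform prior on [LOW, HIGH]).
SIGMA_H, SIGMA_D = 3.0, 1.0   # assumed noise levels, not from the paper
theta_true = np.array([0.2, 1.5])
observed = toy_abm(theta_true)

def log_post(theta):
    if np.any(theta < LOW) or np.any(theta > HIGH):
        return -np.inf
    resid = observed - surrogate(theta)
    return -0.5 * (np.sum(resid[:N_DAYS] ** 2) / SIGMA_H**2
                   + np.sum(resid[N_DAYS:] ** 2) / SIGMA_D**2)

# 5. Random-walk Metropolis on the surrogate posterior.
theta = 0.5 * (LOW + HIGH)
lp = log_post(theta)
chain = []
for _ in range(1000):
    prop = theta + rng.normal(0.0, [0.01, 0.05])
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    chain.append(theta.copy())
samples = np.array(chain[300:])  # discard burn-in
print("posterior mean (rate, severity):", samples.mean(axis=0))
```

Each MCMC step here costs one random forest prediction instead of a full simulator run, which is the source of the computational saving. Because a random forest is piecewise constant, the surrogate posterior is too, so a gradient-free sampler such as this one is a natural fit.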

Bayesian calibration of stochastic agent based model via random forest

Country-Wide Agent-Based Epidemiological Modeling Using 17 Million Individual-Level Microdata

BayCANN: Streamlining Bayesian Calibration with Artificial Neural Network Metamodeling

Considerations in Bayesian agent-based modeling for the analysis of COVID-19 data

Using neural networks to calibrate agent based models enables improved regional evidence for vaccine strategy and policy

Epidemiological Model Calibration via Graybox Bayesian Optimization

Calibration of stochastic, agent-based neuron growth models with Approximate Bayesian Computation

A Bayesian model calibration framework for stochastic compartmental models with both time-varying and time-invariant parameters

Real-Time Epidemiology and Acute Care Need Monitoring and Forecasting for COVID-19 via Bayesian Sequential Monte Carlo-Leveraged Transmission Models

CoV-ABM: A stochastic discrete-event agent-based framework to simulate spatiotemporal dynamics of COVID-19

Bridging the Micro and Macro: Calibration of Agent-Based Model Using Mean-Field Dynamics

Noise-free comparison of stochastic agent-based simulations using common random numbers

Neural parameter calibration and uncertainty quantification for epidemic forecasting

All Models Are Useful: Bayesian Ensembling for Robust High Resolution COVID-19 Forecasting

Agent-based modeling of the COVID-19 pandemic in Florida

Agent based simulators for epidemic modelling: Simulating larger models using smaller ones

On the calibration of compartmental epidemiological models

Neural parameter calibration for large-scale multiagent models

Recalibrating probabilistic forecasts of epidemics