A Flexible Framework for Local-Level Estimation of the Effective Reproductive Number in Geographic Regions with Sparse Data

MD SAKHAWAT HOSSAIN, Ravi Goyal, Natasha K Martin, Victor DeGruttola, Mohammad Mihrab Chowdhury, Christopher McMahan, Lior Rennert
DOI: https://doi.org/10.1101/2024.11.06.24316859
2024-11-07
Abstract:
Background: Our research focuses on local-level estimation of the effective reproductive number (R_t), which describes the transmissibility of an infectious disease and represents the average number of individuals one infectious person infects at a given time. The ability to accurately estimate R_t in geographically granular regions is critical for disaster planning and resource allocation. However, not all regions have sufficient infectious disease outcome data for estimation.
Methods: We propose a two-step approach. In step 1, existing R_t estimation procedures (EpiEstim, EpiFilter, and EpiNow2) are applied to geographic regions with sufficient data. In step 2, the resulting estimates are incorporated into a covariate-adjusted Bayesian Integrated Nested Laplace Approximation (INLA) spatial model to predict R_t in regions with sparse or missing data. Our flexible framework effectively allows for implementing any existing estimation procedure for R_t in regions with coarse or entirely missing data. We perform external validation to evaluate predictive performance.
Results: We applied our method to estimate R_t using data from South Carolina (SC) counties and ZIP codes during the first COVID-19 wave (Wave 1, June 16, 2020 to August 31, 2020) and the second wave (Wave 2, December 16, 2020 to March 2, 2021). Among the three methods used in the first step, EpiNow2 yielded the highest accuracy of R_t prediction in the regions with entirely missing data. Median county-level percentage agreement (PA) was 90.9% (interquartile range, IQR: 89.9-92.0%) and 92.5% (IQR: 91.6-93.4%) for Waves 1 and 2, respectively. Median ZIP code-level PA was 95.2% (IQR: 94.4-95.7%) and 96.5% (IQR: 95.8-97.1%) for Waves 1 and 2, respectively. EpiEstim and EpiFilter yielded median PA ranging from 81.9% to 90.0% and from 87.2% to 92.1%, respectively, across both waves and geographic granularities.
Conclusion: These findings demonstrate that the proposed methodology is a useful tool for small-area estimation of R_t, as our flexible framework yields high prediction accuracy for regions with entirely missing data regardless of the (step 1) estimation procedure used.
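To make step 1 of the Methods concrete, the sketch below computes a windowed renewal-equation point estimate of R_t from a daily case series. It is a simplified illustration in the spirit of renewal-based estimators, not the EpiEstim, EpiFilter, or EpiNow2 implementations used in the paper. The function name `renewal_rt`, the serial-interval weights, and the 7-day window are assumptions chosen for illustration. The step 2 INLA spatial model is outside this sketch.

```python
import numpy as np

def renewal_rt(incidence, si_weights, window=7):
    """Simplified windowed renewal-equation estimate of R_t (hypothetical helper).

    incidence  : daily case counts I_0, I_1, ...
    si_weights : assumed serial-interval weights w_1, w_2, ... (lags of 1, 2, ... days)
    window     : number of days pooled into each estimate (assumed value)
    """
    incidence = np.asarray(incidence, dtype=float)
    w = np.asarray(si_weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1
    n = len(incidence)

    # Total infectiousness: Lambda_t = sum_{s>=1} I_{t-s} * w_s
    lam = np.zeros(n)
    for t in range(1, n):
        s_max = min(t, len(w))
        past = incidence[t - s_max:t][::-1]  # I_{t-1}, I_{t-2}, ...
        lam[t] = np.dot(past, w[:s_max])

    # R_t = (cases in window) / (total infectiousness in window)
    rt = np.full(n, np.nan)
    for t in range(window, n):
        num = incidence[t - window + 1:t + 1].sum()
        den = lam[t - window + 1:t + 1].sum()
        if den > 0:
            rt[t] = num / den
    return rt

# Usage: a flat epidemic curve should give R_t close to 1 once the window
# no longer overlaps the start of the series.
flat = renewal_rt([10] * 30, si_weights=[0.2, 0.5, 0.3])
```

In the framework described above, estimates of this kind from data-rich regions would serve as the inputs to the spatial prediction step; the published estimators additionally quantify uncertainty, which this point estimate omits.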
Infectious Diseases (except HIV/AIDS)