rtestim: Time-varying reproduction number estimation with trend filtering

Jiaping Liu, Zhenglun Cai, Paul Gustafson, Daniel J. McDonald
DOI: https://doi.org/10.1101/2023.12.18.23299302
2024-07-12
Abstract: To understand the transmissibility and spread of infectious diseases, epidemiologists turn to estimates of the instantaneous reproduction number. While many estimation approaches exist, their utility may be limited. Challenges of surveillance data collection, model assumptions that are unverifiable with data alone, and computationally inefficient frameworks are critical limitations for many existing approaches. We propose a discrete spline-based approach that solves a convex optimization problem (Poisson trend filtering) using the proximal Newton method. It produces a locally adaptive estimator for instantaneous reproduction number estimation with heterogeneous smoothness. Our methodology remains accurate even under some process misspecifications and is computationally efficient, even for large-scale data. The implementation is easily accessible in a lightweight R package rtestim (dajmcdon.github.io/rtestim/).
What problem does this paper attempt to address?
This paper proposes a new method for estimating the time-varying reproduction number (the instantaneous reproduction number, R(t)), a key indicator of how transmissible an infectious disease is and how an epidemic is evolving. Existing estimation methods can be limited by challenges in surveillance data collection and data quality, by model assumptions that cannot be verified from data alone, and by computational inefficiency.

The authors propose a discrete spline-based approach, Poisson trend filtering, which is formulated as a convex optimization problem and solved with the proximal Newton method. The resulting estimator is locally adaptive: it can capture an instantaneous reproduction number whose smoothness varies over time. It remains accurate under some forms of process misspecification and is computationally efficient on large-scale data. The method is implemented in a lightweight R package called "rtestim".

According to the paper, compared with popular existing methods such as EpiEstim, EpiNow2, and EpiFilter, the new method is more robust to modeling errors arising from incomplete data, a misspecified delay distribution, and a misspecified incidence distribution. Its performance is demonstrated through examples, including an application to COVID-19 case data with time-varying serial interval distributions. The paper also describes features of the algorithm: handling multiple values of the tuning parameter in parallel to speed up computation, built-in cross-validation for selecting the tuning parameter, and support for time-varying delay distributions.

In summary, the paper aims to estimate the time-varying instantaneous reproduction number accurately and robustly, so that the future course of an infectious disease can be better understood and forecast, while overcoming the limitations of existing methods.
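To make the phrase "Poisson trend filtering" more concrete, the following is a minimal Python sketch of the kind of objective such a method minimizes. It is not the rtestim implementation or its API. It assumes a renewal-equation model in which the expected count at time t equals R(t) times a delay-weighted sum of past counts, with R(t) = exp(theta_t), and it penalizes the absolute discrete differences of theta. The function names, the toy counts, and the delay weights are hypothetical and chosen only for illustration. The sketch evaluates the objective only; it does not include the proximal Newton solver.

```python
import numpy as np


def weighted_past_incidence(y, w):
    """Return Lambda_t = sum_a w[a-1] * y[t-a], the delay-weighted past incidence.

    Lambda_0 is left at 0 because no earlier counts exist.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    lam = np.zeros(len(y))
    for t in range(1, len(y)):
        m = min(t, len(w))                      # number of usable lags at time t
        lam[t] = np.dot(w[:m], y[t - m:t][::-1])  # pairs w_1 with y[t-1], w_2 with y[t-2], ...
    return lam


def difference_matrix(n, order):
    """Discrete difference operator of the given order, built by repeated row differencing."""
    D = np.eye(n)
    for _ in range(order):
        D = np.diff(D, axis=0)
    return D


def poisson_tf_objective(theta, y, lam, penalty_weight, order=1):
    """Poisson negative log-likelihood (up to constants) plus an l1 difference penalty.

    Assumed model: y_t ~ Poisson(lam_t * exp(theta_t)), so R(t) = exp(theta_t).
    """
    theta = np.asarray(theta, dtype=float)
    y = np.asarray(y, dtype=float)
    lam = np.asarray(lam, dtype=float)
    mu = lam * np.exp(theta)                    # expected counts under the assumed model
    nll = np.mean(mu - y * theta)               # terms not involving theta are dropped
    roughness = np.abs(difference_matrix(len(theta), order) @ theta).sum()
    return nll + penalty_weight * roughness


# Toy example with made-up counts and a made-up two-day delay distribution.
counts = np.array([10.0, 12.0, 15.0, 19.0, 24.0, 30.0])
delay = np.array([0.6, 0.4])
Lambda = weighted_past_incidence(counts, delay)

# Skip t = 0, where no past incidence exists, and evaluate at R(t) = 1 (theta = 0).
value = poisson_tf_objective(np.zeros(5), counts[1:], Lambda[1:], penalty_weight=2.0)
```

Because the penalty is an l1 norm of differences, a larger `penalty_weight` pushes more of those differences to exactly zero, which is one way to see where the "locally adaptive" behavior described above can come from: the fitted curve is allowed to change shape at a few points rather than being uniformly smoothed.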