Abstract: Understanding and accurately estimating epidemiological delay distributions is important for public health policy. These estimates directly influence epidemic situational awareness, control strategies, and resource allocation. In this study, we explore challenges in estimating these distributions, including truncation, interval censoring, and dynamical biases. Despite their importance, these issues are frequently overlooked in the current literature, often resulting in biased conclusions. This study aims to shed light on these challenges, providing valuable insights for epidemiologists and infectious disease modellers. Our work motivates comprehensive approaches for accounting for these issues based on the underlying theoretical concepts. We also discuss simpler methods that are widely used, which do not fully account for known biases. We evaluate the statistical performance of these methods using simulated exponential growth and epidemic scenarios informed by data from the 2014-2016 Sierra Leone Ebola virus disease epidemic. Our findings highlight that using simpler methods can lead to biased estimates of vital epidemiological parameters. An approximate-latent-variable method emerges as the best overall performer, while an efficient, widely implemented interval-reduced-censoring-and-truncation method was only slightly worse. Other methods, such as a joint-primary-incidence-and-delay method and a dynamic-correction method, demonstrated good performance under certain conditions, although they have inherent limitations and may not be the best choice for more complex problems. Despite presenting a range of methods that performed well in the contexts we evaluated, residual biases persisted, predominantly due to the simplifying assumption that the distribution of event time within the censoring interval follows a uniform distribution; instead, this distribution should depend on epidemic dynamics. However, in realistic scenarios with daily censoring, these biases appeared minimal. This study underscores the need for caution when estimating epidemiological delay distributions in real-time, provides an overview of the theory that practitioners need to keep in mind when doing so with useful tools to avoid common methodological errors, and points towards areas for future research.

What problem does this paper attempt to address?

The main problem this paper addresses is how to accurately estimate epidemiological delay distributions for infectious diseases. Specifically, it focuses on the following aspects:

1. **The influence of truncation and interval censoring**:
   - When estimating an epidemiological delay distribution, truncation and interval censoring are two important sources of bias. Ignoring them leads to biased estimates of key epidemiological parameters.
   - For example, right truncation means that only events that have already occurred and been reported can be observed; events that have not yet occurred are missing from the data. This biases the observed delays towards shorter intervals.
2. **The influence of dynamical biases**:
   - The delays observed at a given time depend on whether the epidemic is growing or declining. During rapid epidemic growth, shorter delays are more likely to be observed, while the opposite is true during epidemic decline.
   - During exponential growth, dynamical bias is equivalent to the effect of right truncation, but the two are corrected for in different ways, so they need to be clearly distinguished.
3. **The limitations of existing methods**:
   - Although some existing methods adjust for censoring and truncation, they can fall short in practice, for example by not accounting for all types of bias or by performing poorly in specific situations.
   - For example, some commonly used methods assume that event times are uniformly distributed within the censoring interval, whereas this distribution actually depends on epidemic dynamics. In realistic scenarios with daily censoring, however, the resulting bias appeared minimal.
4. **Providing improved methods and tools**:
   - The paper presents several methods, such as the approximate latent variable method, and evaluates them using simulated exponential growth and epidemic scenarios informed by data from the 2014-2016 Sierra Leone Ebola virus disease epidemic.
   - These methods aim to better handle censoring, truncation, and dynamical biases, thereby improving the accuracy of the estimated delay distribution.

In summary, the paper systematically analyzes and evaluates the strengths and weaknesses of existing methods in order to support comprehensive and robust estimation of epidemiological delay distributions, reduce estimation bias, and improve the accuracy of public health decision-making.
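The right-truncation effect described above can be illustrated with a small simulation. This is only a minimal sketch and is not taken from the paper: the function name `simulate_truncated_delays`, the gamma-distributed delay, and all parameter values (growth rate, observation time, mean delay) are assumptions chosen for illustration. Primary events are drawn with density proportional to exp(r t) to mimic exponential growth, and a delay counts as observed only if its secondary event falls before the observation time.

```python
import numpy as np


def simulate_truncated_delays(n=200_000, growth_rate=0.2, obs_time=30.0,
                              mean_delay=5.0, shape=2.0, seed=1):
    """Simulate delays during exponential growth with right truncation.

    All parameter values are illustrative assumptions, not estimates
    from the paper.
    """
    rng = np.random.default_rng(seed)

    # Primary event times on [0, obs_time] with density proportional to
    # exp(growth_rate * t), sampled by inverting the CDF.
    u = rng.random(n)
    primary = np.log1p(u * np.expm1(growth_rate * obs_time)) / growth_rate

    # True delays: gamma distributed with the requested mean.
    delay = rng.gamma(shape, mean_delay / shape, size=n)

    # Right truncation: a pair is seen only if the secondary event
    # has already happened by the observation time.
    observed = primary + delay <= obs_time
    return delay, observed


delay, observed = simulate_truncated_delays()
print(f"true mean delay:          {delay.mean():.2f}")
print(f"naive mean of observed:   {delay[observed].mean():.2f}")
print(f"fraction of pairs seen:   {observed.mean():.2f}")
```

In this sketch the naive mean of the observed delays falls below the true mean, because recent primary events with long delays have not yet produced a secondary event. A truncation-adjusted method would instead condition each delay on being short enough to have been observed by the observation time.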

Estimating epidemiological delay distributions for infectious diseases
