A mechanistic modeling and estimation framework for environmental pathogen surveillance

Matthew Wascher, Colin J Klaus, Chance Alvarado, Jenny Panescu, Mikkel Quam, Karen C Dannemiller, Joseph H Tien
DOI: https://doi.org/10.1016/j.mbs.2024.109257
2024-08-20
Abstract: Environmental pathogen surveillance is a promising disease surveillance modality that has been widely adopted for SARS-CoV-2 monitoring. The highly variable nature of environmental pathogen data is a challenge for integrating these data into public health response. One source of this variability is heterogeneity in pathogen shedding, both within an individual over the course of infection and between individuals. We present a mechanistic modeling and estimation framework for connecting environmental pathogen data to the number of infected individuals. Infected individuals are modeled as shedding pathogen into the environment via a Poisson process whose rate parameter λt varies over the course of their infection. These shedding curves λt are themselves random, allowing for variation between individuals. We show that this results in a Poisson process for environmental pathogen levels whose rate parameter is a function of the number of infected individuals, total shedding over the course of infection, and pathogen removal from the environment. Theoretical results include determination of identifiable parameters for the model from environmental pathogen data and simple, explicit formulas for the likelihood for particular choices of individual shedding curves. We give a two-step Bayesian inference framework: a calibration step using data where the number of infected individuals is known, followed by an estimation step using environmental surveillance data where the number of infected individuals is unknown. We apply this modeling and estimation framework to synthetic data, as well as to an empirical case study of SARS-CoV-2 in environmental dust collected from isolation rooms housing university students. Both the synthetic data and empirical case study indicate high inter-individual variation in shedding, leading to wide credible intervals for the number of infected individuals.
We examine how uncertainty in estimates of the number of infected individuals from environmental pathogen levels scales with the true number of infected individuals and model misspecification. While credible intervals for the number of infected individuals are wide, our results suggest that distinguishing between no infection and small-to-moderate levels of infection (≈10 infected individuals) may be possible, and that it is broadly possible to differentiate between moderate (≈40) and high (≈200) numbers of infected individuals.
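The shedding mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation: the curve shape (amplitude × τ × exp(−τ)), the lognormal amplitude distribution, the uniform infection onset times, the exponential removal of pathogen from the environment, and every parameter value are assumptions chosen only for illustration. It draws each individual's shedding events from an inhomogeneous Poisson process by thinning, then keeps only the particles that survive in the environment until the observation time.

```python
import math
import random


def shedding_rate(tau, amplitude):
    """Hypothetical shedding curve lambda(tau) = amplitude * tau * exp(-tau),
    where tau is the time since infection. The shape is an assumption."""
    return amplitude * tau * math.exp(-tau) if tau > 0 else 0.0


def simulate_env_count(n_infected, t_obs=10.0, removal_rate=0.2, sigma=1.0, rng=None):
    """Simulate the pathogen count present in the environment at time t_obs.

    All defaults are illustrative assumptions, not values from the paper.
    """
    if rng is None:
        rng = random.Random()
    count = 0
    for _ in range(n_infected):
        # Random amplitude gives between-individual variation in shedding.
        amplitude = rng.lognormvariate(math.log(50.0), sigma)
        onset = rng.uniform(0.0, t_obs)
        # The assumed curve peaks at tau = 1 with value amplitude / e.
        rate_max = amplitude / math.e
        t = onset
        while True:
            # Candidate events from a homogeneous process at the peak rate.
            t += rng.expovariate(rate_max)
            if t >= t_obs:
                break
            # Thinning: accept a candidate with probability lambda(t) / rate_max.
            shed = rng.random() < shedding_rate(t - onset, amplitude) / rate_max
            # Each shed particle independently survives removal until t_obs.
            survives = rng.random() < math.exp(-removal_rate * (t_obs - t))
            if shed and survives:
                count += 1
    return count


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (0, 10, 40, 200):
        draws = [simulate_env_count(n, rng=rng) for _ in range(5)]
        print(n, draws)
```

Under these assumptions, repeated draws for a fixed number of infected individuals can differ widely because of the random amplitudes, which is the kind of inter-individual variation the abstract identifies as the source of wide credible intervals.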