Machine learning surveillance of foodborne infectious diseases using wastewater microbiome, crowdsourced, and environmental data

Seungdae Oh, Haeil Byeon, Jonathan Wijaya
DOI: https://doi.org/10.1016/j.watres.2024.122282
2024-08-21
Abstract: Clostridium perfringens (CP) is a common cause of foodborne infection, leading to significant human health risks and a high economic burden. Thus, effective CP disease surveillance is essential for preventive and therapeutic interventions; however, conventional practices often entail complex, resource-intensive, and costly procedures. This study introduced a data-driven machine learning (ML) modeling framework for CP-related disease surveillance. It leveraged an integrated dataset of municipal wastewater microbiome (e.g., CP abundance), crowdsourced (CP-related web search keywords), and environmental data. Various optimization strategies, including data integration, data normalization, model selection, and hyperparameter tuning, were implemented to improve the ML modeling performance, leading to enhanced predictions of CP cases over time. Explainable artificial intelligence methods identified CP abundance as the most reliable predictor of CP disease cases. Multi-omics subsequently revealed the presence of CP and its genotypes/toxinotypes in wastewater, validating the utility of microbiome-data-enabled ML surveillance for foodborne diseases. This ML-based framework thus exhibits significant potential for complementing and reinforcing existing disease surveillance systems.