The utility of whole-genome sequencing to identify likely transmission pairs for pathogens with slow and variable evolution

Anthony J Wood,Clare H Benton,Richard J Delahay,Glenn Marion,Eleftheria Palkopoulou,Christopher M Pooley,Rowland R Kao
DOI: https://doi.org/10.1101/2024.05.06.592672
2024-09-02
Abstract: Pathogen whole-genome sequencing (WGS) has been used to track the transmission of infectious diseases in extraordinary detail, especially for pathogens that undergo fast and steady evolution, as is the case with many RNA viruses. However, for other pathogens evolution is less predictable, making interpretation of these data to inform our understanding of their epidemiology more challenging and the value of densely collected pathogen genome data uncertain. Here, we assess the utility of WGS for one such pathogen, in the "who-infected-whom" identification problem. We study samples from hosts (130 cattle, 111 badgers) with confirmed infection of M. bovis (causing bovine tuberculosis), which has an estimated clock rate as slow as ~0.1-1 variations per year. For each potential pathway between hosts, we calculate the likelihood that such a transmission event occurred. This is informed by an epidemiological model of transmission, and host life history data. By including WGS data, we shrink the number of plausible pathways significantly, relative to those deemed likely on the basis of life history data alone. Despite our uncertainty relating to the evolution of M. bovis, the WGS data are therefore a valuable adjunct to epidemiological investigations, especially for wildlife species whose life history data are sparse.
Ecology
What problem does this paper attempt to address?
This paper asks whether whole-genome sequencing (WGS) can be used effectively to identify likely transmission pathways when pathogen evolution is slow and unpredictable. The case study is Mycobacterium bovis (M. bovis), the pathogen that causes bovine tuberculosis (bTB). Specifically, the researchers evaluate the value of WGS for the "who infected whom" question in a multi-host system, with transmission between domestic animals (cattle) and wildlife (badgers).

### Background and Problem Statement

- **Background**: WGS has been widely used to track the transmission of infectious diseases, especially for pathogens that evolve quickly and steadily (such as many RNA viruses). For pathogens whose evolution is slow and variable, such as M. bovis, the data are harder to interpret and the value of densely collected genome data is uncertain.
- **Problem**: The clock rate of M. bovis is very slow, estimated at roughly 0.1 to 1 mutations per year. This makes it very difficult to infer specific infection pathways from WGS data alone. The researchers therefore evaluate the practical utility of WGS by combining it with host life-history data (such as birth time, death time, location and movement history).

### Research Methods

- **Sample Selection**: The researchers used samples from 130 cattle and 111 badgers with confirmed M. bovis infection. The samples came from a long-term study area (Woodchester Park).
- **Data Processing**: Duplicates and samples with unclear sampling times were excluded, leaving 241 samples.
- **Relative Likelihood Calculation**: For each potential transmission pathway between a pair of hosts, the researchers calculated a relative likelihood. The calculation was performed in two ways:
  - **Based on Life-history Data**: Uses the lifespan, location and movement history of each host.
  - **Combined with WGS Data**: Additionally uses the number of single-nucleotide polymorphism (SNP) differences between the M. bovis genome sequences sampled from the two hosts.

### Main Findings

- **Fewer Plausible Pathways**: Including WGS data significantly reduced the number of transmission pathways considered plausible. Many pathways that seemed reasonable on life-history data alone were excluded because the genetic distance between the two samples was extremely unlikely given what is known about M. bovis evolution.
- **Better Discrimination**: WGS data separated candidate pathways more sharply. The standard deviation of the relative likelihoods increased once WGS data were included, meaning the distribution across pathways was more dispersed and individual pathways were easier to tell apart.
- **Case-specific Analysis**: The researchers used specific cases to show how WGS data can support or refute pathway hypotheses based on life-history data. Some pathways that seemed reasonable on life-history data were excluded because the genetic distance was too large, while some pathways that initially seemed less likely were reconsidered because the genetic distance was consistent with transmission.

### Conclusion

Although M. bovis evolves slowly and unpredictably, WGS data remain a valuable supplement to epidemiological investigations. Combining WGS data with host life-history data lets researchers identify likely transmission pathways more accurately, especially in multi-host systems. This helps clarify how the disease is transmitted and provides a scientific basis for designing effective disease control strategies.
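
The relative likelihood calculation described under the research methods can be illustrated with a small sketch. It is not the paper's model: the summary does not state how SNP differences enter the likelihood. As an assumption, the sketch treats SNP accumulation as a Poisson process at a fixed clock rate, taken from the 0.1 to 1 mutations per year range quoted above. The field names (`life_history_weight`, `snp_distance`, `divergence_years`), the function names and all numbers are hypothetical and chosen only for illustration.

```python
import math
import statistics


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def relative_likelihoods(candidates, clock_rate=0.5, use_wgs=True):
    """Normalised likelihood of each candidate source for one recipient host.

    candidates: list of dicts with hypothetical keys
      source              - label of the candidate infecting host
      life_history_weight - likelihood from life-history data alone
      snp_distance        - SNP differences between the two sampled genomes
      divergence_years    - assumed time separating the two sampled genomes
    clock_rate: assumed mutations per genome per year (illustrative default)
    """
    weights = {}
    for c in candidates:
        w = c["life_history_weight"]
        if use_wgs:
            # Assumption: SNPs accumulate as a Poisson process at clock_rate.
            expected_snps = clock_rate * c["divergence_years"]
            w *= poisson_pmf(c["snp_distance"], expected_snps)
        weights[c["source"]] = w
    total = sum(weights.values())
    if total == 0:
        return {source: 0.0 for source in weights}
    return {source: w / total for source, w in weights.items()}


# Illustrative (made-up) candidate sources for a single infected host.
candidates = [
    {"source": "cow_A", "life_history_weight": 0.5,
     "snp_distance": 0, "divergence_years": 2.0},
    {"source": "badger_B", "life_history_weight": 0.3,
     "snp_distance": 1, "divergence_years": 2.0},
    {"source": "cow_C", "life_history_weight": 0.2,
     "snp_distance": 12, "divergence_years": 2.0},
]

life_history_only = relative_likelihoods(candidates, use_wgs=False)
with_wgs = relative_likelihoods(candidates, use_wgs=True)

# Spread of the relative likelihoods, used here as a simple dispersion measure.
spread_life_history = statistics.pstdev(life_history_only.values())
spread_with_wgs = statistics.pstdev(with_wgs.values())
```

In this toy example, `cow_C` is plausible on life-history data alone but is effectively ruled out once its 12-SNP distance is taken into account, and the spread of the relative likelihoods grows when the WGS term is included. This mirrors, in miniature, the two findings reported above.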