Comparative whole-genome sequence analysis of Mycobacterium tuberculosis isolated from pulmonary tuberculosis and tuberculous lymphadenitis patients in Northwest Ethiopia

Daniel Mekonnen, Abaineh Munshea, Endalkachew Nibret, Bethlehem Adnew, Silvia Herrera-Leon, Aranzazu Amor Aramendia, Agustín Benito, Estefanía Abascal, Camille Jacqueline, Abraham Aseffa, Laura Herrera-Leon
DOI: https://doi.org/10.3389/fmicb.2023.1211267
2023-06-30
Abstract:
Background: Tuberculosis (TB), caused by the Mycobacterium tuberculosis complex (MTBC), is a chronic infectious disease with both pulmonary and extrapulmonary forms. This study set out to investigate and compare the genomic diversity and transmission dynamics of Mycobacterium tuberculosis (Mtb) isolates obtained from tuberculous lymphadenitis (TBLN) and pulmonary TB (PTB) cases in Northwest Ethiopia.
Methods: A facility-based cross-sectional study was conducted using two groups of samples collected between February 2021 and June 2022 (Group 1) and between June 2020 and June 2022 (Group 2) in Northwest Ethiopia. Deoxyribonucleic acid (DNA) was extracted from 200 heat-inactivated Mtb isolates. Whole-genome sequencing (WGS) was performed on the 161 isolates having ≥1 ng DNA/μl using Illumina NovaSeq 6000 technology.
Results: Of the 161 isolates sequenced, 146 Mtb isolates were successfully genotyped into three lineages (L) and 18 sub-lineages. The Euro-American (EA, L4) lineage was the most prevalent (n = 100; 68.5%), followed by Central Asian (CAS, L3; n = 43; 25.3%) and L7 (n = 3; 2.05%). The L4.2.2.ETH sub-lineage accounted for 19.9% of isolates and Haarlem for 13.7%. The phylogenetic tree revealed distinct Mtb clusters between PTB and TBLN isolates, even though there was no difference at the lineage and sub-lineage levels. The clustering rate (CR) and recent transmission index (RTI) for PTB were 30% and 15%, respectively. Similarly, the CR and RTI for TBLN were 31.1% and 18%, respectively.
Conclusion and recommendations: PTB and TBLN isolates showed no difference in Mtb lineages or sub-lineages. However, at a threshold of five allelic distances, Mtb isolates obtained from PTB and TBLN formed distinct complexes in the phylogenetic tree, which indicates the presence of Mtb genomic variation between the two clinical forms. The high clustering rate and RTI among TBLN isolates imply that TBLN was likely the result of recent transmission and/or reactivation after a short latency. Hence, the high incidence rate of TBLN in the Amhara region could be the result of Mtb genomic diversity and rapid clinical progression from primary infection and/or short latency. To validate this conclusion, a similar community-based study with a larger sample size and a better sampling technique is highly desirable. Additionally, analysis of genomic variants other than phylogenetically informative regions could give insightful information. Combined analysis of the host and pathogen genomes (GxG) together with environmental factors (GxGxE) could give comprehensive co-evolutionary information.
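The abstract reports a clustering rate (CR) and a recent transmission index (RTI) without stating how they were computed. The sketch below assumes the commonly used definitions, CR = clustered isolates / total isolates and RTI by the "n − 1" method = (clustered isolates − number of clusters) / total isolates; the paper may have used a variant. The function name `clustering_metrics` and the example labels are hypothetical and are not data from the study. Cluster membership is taken as given (for example, assigned at the five-allele distance threshold mentioned above).

```python
from collections import Counter


def clustering_metrics(cluster_labels):
    """Return (clustering_rate, recent_transmission_index).

    cluster_labels: one entry per isolate; None marks an isolate that
    belongs to no cluster. Definitions are assumed, not taken from the paper:
      CR  = clustered isolates / total isolates
      RTI = (clustered isolates - number of clusters) / total isolates
    """
    total = len(cluster_labels)
    if total == 0:
        raise ValueError("at least one isolate is required")

    sizes = Counter(label for label in cluster_labels if label is not None)
    # A group only counts as a cluster if it holds two or more isolates.
    clusters = {label: n for label, n in sizes.items() if n >= 2}

    clustered = sum(clusters.values())
    cr = clustered / total
    rti = (clustered - len(clusters)) / total
    return cr, rti


# Hypothetical example: 10 isolates, two clusters (sizes 3 and 2), 5 unique.
labels = ["A", "A", "A", "B", "B", None, None, None, None, None]
cr, rti = clustering_metrics(labels)
print(f"CR = {cr:.1%}, RTI = {rti:.1%}")  # CR = 50.0%, RTI = 30.0%
```

Under these definitions the RTI is always lower than the CR, because one isolate per cluster is treated as the source case rather than as a recently transmitted one.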