Pangenome-spanning epistasis and coselection analysis via de Bruijn graphs
Juri Kuronen, Samuel T. Horsfield, Anna K. Pöntinen, Sudaraka Mallawaarachchi, Sergio Arredondo-Alonso, Harry Thorpe, Rebecca A. Gladstone, Rob J.L. Willems, Stephen D. Bentley, Nicholas J. Croucher, Johan Pensar, John A. Lees, Gerry Tonkin-Hill, Jukka Corander
DOI: https://doi.org/10.1101/gr.278485.123
IF: 9.438
2024-08-22
Genome Research
Abstract: Studies of bacterial adaptation and evolution are hampered by the difficulty of measuring traits such as virulence, drug resistance, and transmissibility in large populations. In contrast, it is now feasible to obtain high-quality complete assemblies of many bacterial genomes thanks to scalable high-accuracy long-read sequencing technologies. To exploit this opportunity, we introduce a phenotype- and alignment-free method for discovering coselected and epistatically interacting genomic variation from genome assemblies covering both core and accessory parts of genomes. Our approach uses a compact colored de Bruijn graph to approximate the intragenome distances between pairs of loci for a collection of bacterial genomes to account for the impacts of linkage disequilibrium (LD). We demonstrate the versatility of our approach to efficiently identify associations between loci linked with drug resistance and adaptation to the hospital niche in the major human bacterial pathogens Streptococcus pneumoniae and Enterococcus faecalis.
genetics & heredity, biochemistry & molecular biology, biotechnology & applied microbiology
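
The abstract's central idea, approximating the intragenome distance between two loci through a colored de Bruijn graph, can be illustrated with a toy sketch. This is not the authors' implementation: the function names `build_colored_dbg` and `graph_distance`, the use of undirected edges, the absence of graph compaction, and the breadth-first search are all assumptions chosen for brevity. Each k-mer is a node, each node carries the set of genomes ("colors") that contain it, and the distance between two k-mers within one genome is taken as the shortest path through nodes of that color.

```python
from collections import defaultdict, deque


def build_colored_dbg(genomes, k):
    """Build a toy (uncompacted) colored de Bruijn graph.

    Nodes are k-mers; an undirected edge joins k-mers that are adjacent
    in at least one genome. colors[kmer] is the set of genome indices
    in which that k-mer occurs. Hypothetical helper for illustration.
    """
    edges = defaultdict(set)
    colors = defaultdict(set)
    for genome_id, seq in enumerate(genomes):
        kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        for kmer in kmers:
            colors[kmer].add(genome_id)
        for left, right in zip(kmers, kmers[1:]):
            edges[left].add(right)
            edges[right].add(left)
    return edges, colors


def graph_distance(edges, colors, source, target, genome_id):
    """Shortest path length (in edges) between two k-mers, visiting only
    k-mers present in the given genome. Returns None when either k-mer
    is absent from that genome or no such path exists.
    """
    if genome_id not in colors.get(source, ()):
        return None
    if genome_id not in colors.get(target, ()):
        return None
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in edges.get(node, ()):
            if neighbour not in seen and genome_id in colors[neighbour]:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Two made-up sequences that differ at one position.
genomes = ["ACGTTGCA", "ACGTAGCA"]
edges, colors = build_colored_dbg(genomes, k=3)
d0 = graph_distance(edges, colors, "ACG", "GCA", genome_id=0)
d1 = graph_distance(edges, colors, "GTT", "GCA", genome_id=1)  # GTT is absent from genome 1
```

In this sketch a pair of loci with a small graph distance would be treated as physically close, so an association between them is more plausibly explained by LD than by coselection or epistasis; how distances are aggregated across genomes and used to flag outlying pairs is left out here because the abstract does not specify it.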