InStrain enables population genomic analysis from metagenomic data and rigorous detection of identical microbial strains

Matthew R. Olm, Alexander Crits-Christoph, Keith Bouma-Gregson, Brian Firek, Michael J. Morowitz, Jillian F. Banfield
DOI: https://doi.org/10.1101/2020.01.22.915579
2020-01-23
Abstract: Coexisting microbial cells of the same species often exhibit genetic differences that can affect phenotypes ranging from nutrient preference to pathogenicity. Here we present inStrain, a program that utilizes metagenomic paired reads to profile intra-population genetic diversity (microdiversity) across whole genomes and compare populations in a microdiversity-aware manner, dramatically increasing genomic comparison accuracy when benchmarked against existing methods. We use inStrain to profile >1,000 fecal metagenomes from newborn premature infants and find that siblings share significantly more strains than unrelated infants, although identical twins share no more strains than fraternal siblings. Infants born via cesarean section harbored Klebsiella with significantly higher nucleotide diversity than infants delivered vaginally, potentially reflecting acquisition from hospital versus maternal microbiomes. Genomic loci showing diversity within an infant included variants found in other infants, possibly reflecting inoculation from diverse hospital-associated sources. InStrain can be applied to any metagenomic dataset for microdiversity analysis and rigorous strain comparison.