Bin Chicken: targeted metagenomic coassembly for the efficient recovery of novel genomes

Samuel T. N. Aroney, Rhys J. P. Newell, Gene W. Tyson, Ben J. Woodcroft
DOI: https://doi.org/10.1101/2024.11.24.625082
2024-11-25
Abstract: Recovery of microbial genomes from metagenomic datasets has provided genomic representation for hundreds of thousands of species from diverse biomes. However, low abundance microorganisms are often missed due to insufficient genomic coverage. Here we present Bin Chicken, an algorithm which substantially improves genome recovery through automated, targeted selection of metagenomes for coassembly based on shared marker gene sequences derived from raw reads. Marker gene sequences that are divergent from known reference genomes can be further prioritised, providing an efficient means of recovering highly novel genomes. Applying Bin Chicken to public metagenomes and coassembling 800 sample-groups recovered 77,562 microbial genomes, including the first genomic representatives of 6 phyla, 41 classes, and 24,028 species. These genomes expand the genomic tree of life and uncover a wealth of novel microbial lineages for further research.
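The selection idea in the abstract can be illustrated with a minimal sketch. This is not Bin Chicken's implementation: the abstract does not specify the scoring or grouping procedure, so the function name `rank_coassembly_pairs`, the pairwise grouping, the `min_shared` threshold, and the toy marker identifiers are all assumptions made for illustration. The sketch treats each metagenome as a set of marker gene sequence identifiers, drops those already present in reference genomes, and ranks sample pairs by how many of the remaining (putatively novel) sequences they share.

```python
from itertools import combinations


def rank_coassembly_pairs(sample_markers, known_markers=frozenset(), min_shared=1):
    """Rank sample pairs by the number of shared marker sequences
    that are absent from a reference set (hypothetical scoring)."""
    known = set(known_markers)
    # Keep only marker sequences not already represented by reference genomes.
    novel = {name: set(markers) - known for name, markers in sample_markers.items()}

    scored = []
    for a, b in combinations(sorted(novel), 2):
        shared = novel[a] & novel[b]
        if len(shared) >= min_shared:
            scored.append((len(shared), a, b))

    # Most shared novel markers first; ties broken by sample name.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(a, b, n) for n, a, b in scored]


# Toy input: marker identifiers are made up for illustration.
samples = {
    "soil_1": {"otu_a", "otu_b", "otu_c"},
    "soil_2": {"otu_b", "otu_c", "otu_d"},
    "marine_1": {"otu_a", "otu_e"},
}
known = {"otu_a"}

ranked = rank_coassembly_pairs(samples, known)
print(ranked)
```

With the toy input, only `soil_1` and `soil_2` share sequences once `otu_a` is excluded as known, so they form the single candidate pair. Samples that share many unrepresented marker sequences are likely to contain the same low-abundance organisms, which is why pooling their reads could raise coverage enough for assembly.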
Microbiology