Highly accurate metagenome-assembled genomes from human gut microbiota using long-read assembly, binning, and consolidation methods
Daniel M Portik, Xiaowen Feng, Gaetan Benoit, Daniel J Nasko, Benjamin Auch, Samuel J Bryson, Raul Cano, Martha Carlin, Anabelle Damerum, Brett Farthing, Jonas R Grove, Moutusee Islam, Kyle W Langford, Ivan Liachko, Kristopher Locken, Hayley Mangelson, Shuiquan Tang, Siyuan Zhang, Christopher Quince, Jeremy E Wilkinson
DOI: https://doi.org/10.1101/2024.05.10.593587
2024-05-11
Abstract: Long-read metagenomic sequencing is a powerful approach for cataloging the microbial diversity present in complex microbiomes, including the human gut microbiome. We performed a deep-sequencing experiment using PacBio HiFi reads to obtain metagenome-assembled genomes (MAGs) from a pooled human gut microbiome. We performed long-read metagenome assembly using two methods (hifiasm-meta, metaMDBG), used improved bioinformatic and proximity ligation binning strategies to cluster contigs and identify MAGs, and developed a novel framework to compare and consolidate MAGs (pb-MAG-mirror). We found that proximity ligation binning yielded more MAGs than bioinformatic binning, but that our novel comparison framework produced higher MAG yields than either binning strategy alone. In total, from 255 Gbp of HiFi data we produced 595 MAGs (including 175 high-quality MAGs) using hifiasm-meta, and 547 MAGs (including 277 high-quality MAGs) using metaMDBG. Hifiasm-meta assembled almost twice as many strain-level MAGs as metaMDBG (246 vs. 156), but both assembly methods produced up to five strains for a species. Approximately 85% of the MAGs were assigned to known species, but we recovered >35 high-quality MAGs that represent uncultured diversity. Based on strict similarity scores, we found that 125 MAGs were unequivocally shared across the assembly methods at the strain level, representing ~22% of the total MAGs recovered per method. Finally, we detected more viral sequences in the metaMDBG assembly than in the hifiasm-meta assembly (~6,700 vs. ~4,500). Overall, we find that the use of HiFi sequencing, improved metagenome assembly methods, and complementary binning strategies is highly effective for rapidly cataloging microbial genomes in complex microbiomes.
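The percentages quoted in the abstract can be related back to the reported counts with a short calculation. The sketch below is illustrative only and is not part of the authors' pipeline: the dictionary layout and the `percent` and `summarize` helpers are hypothetical names, and it assumes that "per method" means the 125 shared MAGs divided by each assembler's total MAG count.

```python
# Counts copied from the abstract; the data layout is an illustrative assumption.
REPORTED = {
    "hifiasm-meta": {"total_mags": 595, "high_quality": 175},
    "metaMDBG": {"total_mags": 547, "high_quality": 277},
}
SHARED_STRAIN_LEVEL = 125  # MAGs shared across both assemblies at the strain level


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


def summarize(reported, shared):
    """Compute, per assembler, the shared and high-quality share of all MAGs.

    Assumes the denominator for the shared fraction is the assembler's
    total MAG count (an interpretation of "per method" in the abstract).
    """
    summary = {}
    for name, counts in reported.items():
        total = counts["total_mags"]
        summary[name] = {
            "shared_pct": round(percent(shared, total), 1),
            "high_quality_pct": round(percent(counts["high_quality"], total), 1),
        }
    return summary


summary = summarize(REPORTED, SHARED_STRAIN_LEVEL)
for name, stats in summary.items():
    print(f"{name}: {stats['shared_pct']}% shared, "
          f"{stats['high_quality_pct']}% high-quality")
```

Under this reading, the shared fraction comes out at roughly 21% for hifiasm-meta and 23% for metaMDBG, which is consistent with the ~22% figure quoted above.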
Microbiology