Abstract

A longstanding challenge in human microbiome research is achieving the taxonomic and functional resolution needed to generate testable hypotheses about the gut microbiome’s impact on health and disease. More recently, this challenge has extended to a need for in-depth understanding of the pharmacokinetics and pharmacodynamics of clinical microbiome-based interventions. Whole genome metagenomic sequencing provides high taxonomic resolution and information on metagenome functional capacity, but the required deep sequencing is costly. For this reason, short-read sequencing of the bacterial 16S ribosomal RNA (rRNA) gene is the standard for microbiota profiling, despite its poor taxonomic resolution. The recent falling costs and improved fidelity of long-read sequencing warrant an evaluation of this approach for clinical microbiome analysis. We used samples from participants enrolled in a Phase 1b clinical trial of a novel live biotherapeutic product to perform a comparative analysis of short-read and long-read amplicon and metagenomic sequencing approaches to assess their value for generating informative and actionable clinical microbiome data. Comparison of ubiquitous short-read 16S rRNA amplicon profiling to long-read profiling of the 16S-ITS-23S rRNA amplicon showed that only the latter provided strain-level community resolution and insight into novel taxa. Across all methods, overall community taxonomic profiles were comparable and relationships between samples were conserved, highlighting the accuracy of modern microbiome analysis pipelines. All methods identified an active ingredient strain in treated study participants, though detection confidence was higher for long-read methods. Read coverage from both metagenomic methods provided evidence of active ingredient strain replication in some treated participants. Compared to short-read metagenomics, approximately twice the proportion of long reads were assigned functional annotations (63% vs. 34%). Finally, similar bacterial metagenome-assembled genomes (MAGs) were recovered across short-read and long-read metagenomic methods, although MAGs recovered from long reads were more complete. Overall, despite higher costs, long-read microbiome characterization provides added scientific value for clinical microbiome research in the form of higher taxonomic and functional resolution and improved recovery of microbial genomes compared to traditional short-read methodologies.

Data Summary

All supporting data, code and protocols have been provided within the article or as supplementary data files. Two supplementary figures and four supplementary tables are available with the online version of this article. Sequencing data are accessible in the National Center for Biotechnology Information (NCBI) database under BioProject accession number PRJNA754443. The R code and additional data files used for analysis and figure generation are accessible in a GitHub repository (https://github.com/jeanette-gehrig/Gehrig_et_al_sequencing_comparison).

Impact Statement

Accurate sequencing and analysis are essential for informative microbiome profiling, which is critical for the development of novel microbiome-targeted therapeutics. Recent improvements in long-read sequencing technology provide a promising, but more costly, alternative to ubiquitous short-read sequencing. To our knowledge, a direct comparison of the informational value of short-read and HiFi long-read sequencing approaches has not been reported for clinical microbiome samples. Using samples from participants in a Phase 1b trial of a live biotherapeutic product, we compare microbiome profiles generated from short-read and long-read sequencing for both amplicon-based 16S ribosomal RNA profiling and metagenomic sequencing. Though overall taxonomic profiles were similar across methods, only long-read amplicon sequencing provided strain-level resolution, and long-read metagenomic sequencing resulted in a significantly greater proportion of functionally annotated genes. Detection of a live biotherapeutic active ingredient strain in treated participants was achieved with all methods, and both metagenomic methods provided evidence of active replication of this strain in some participants. Similar taxonomies were recovered through metagenomic assemblies of short and long reads, although assemblies were more complete with long reads. Overall, we show the utility of long-read microbiome sequencing in direct comparison to commonly used short-read methods for clinically relevant microbiome profiling.
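The text above says read coverage gave evidence of active replication but does not say how that evidence was derived. One widely used idea is the peak-to-trough ratio: in a replicating population of bacteria with a circular chromosome, regions near the replication origin are covered more deeply than regions near the terminus, so a ratio well above 1 hints at replication. The following is a minimal Python sketch of that idea only, not the authors' pipeline. The function names, the bin size, and the smoothing window are all hypothetical choices made for illustration.

```python
from statistics import median


def binned_coverage(depths, bin_size):
    """Median depth in each non-overlapping bin (a trailing partial bin is dropped)."""
    last_start = len(depths) - bin_size
    return [median(depths[i:i + bin_size]) for i in range(0, last_start + 1, bin_size)]


def circular_smooth(values, window):
    """Moving average that wraps around, treating the genome as circular."""
    n = len(values)
    half = window // 2
    width = 2 * half + 1
    return [
        sum(values[(i + k) % n] for k in range(-half, half + 1)) / width
        for i in range(n)
    ]


def peak_to_trough_ratio(depths, bin_size=1000, window=5):
    """Ratio of the highest to the lowest smoothed, binned coverage.

    `depths` is a per-base read depth list for one genome. The defaults
    for `bin_size` and `window` are illustrative assumptions, not values
    taken from the article.
    """
    bins = binned_coverage(depths, bin_size)
    if not bins:
        raise ValueError("genome is shorter than one bin")
    smooth = circular_smooth(bins, window)
    trough = min(smooth)
    if trough <= 0:
        raise ValueError("trough coverage is zero; ratio is undefined")
    return max(smooth) / trough


def synthetic_depths(length, origin_depth, terminus_depth):
    """Toy coverage: linear decline from the origin (position 0) to the
    terminus (position length/2) and back, on a circular genome."""
    half = length / 2
    drop = origin_depth - terminus_depth
    return [
        origin_depth - drop * min(p, length - p) / half
        for p in range(length)
    ]


if __name__ == "__main__":
    flat = [30] * 1000
    replicating = synthetic_depths(1000, origin_depth=20, terminus_depth=10)
    print("flat:", peak_to_trough_ratio(flat, bin_size=10))
    print("replicating:", peak_to_trough_ratio(replicating, bin_size=10))
```

Binning with the median and smoothing across neighbouring bins are there to dampen local coverage spikes (for example, repeats or mobile elements) before the extremes are taken. Real data would also need filtering of low-mappability regions and a minimum-coverage threshold, which this sketch leaves out.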

Improving bacterial metagenomic research through long read sequencing

Unveiling microbial diversity: harnessing long-read sequencing technology

A comparison of short-read, HiFi long-read, and hybrid strategies for genome-resolved metagenomics

Perspectives and benefits of high-throughput long-read sequencing in microbial ecology

Advancing metagenome-assembled genome-based pathogen identification: unraveling the power of long-read assembly algorithms in Oxford Nanopore sequencing

Finding the right fit: A comprehensive evaluation of short-read and long-read sequencing approaches to maximize the utility of clinical microbiome data

A comprehensive investigation of metagenome assembly by linked-read sequencing

Finding the right fit: evaluation of short-read and long-read sequencing approaches to maximize the utility of clinical microbiome data

Concatenation of paired-end reads improves taxonomic classification of amplicons for profiling microbial communities

The impact of sequencing depth on the inferred taxonomic composition and AMR gene content of metagenomic samples

Analysis of the microbiome: Advantages of whole genome shotgun versus 16S amplicon sequencing

The effect of taxonomic classification by full-length 16S rRNA sequencing with a synthetic long-read technology

Ultra-deep, long-read nanopore sequencing of mock microbial community standards

Improved Microbial Community Characterization of 16S rRNA Via Metagenome Hybridization Capture Enrichment

High-resolution microbial community reconstruction by integrating short reads from multiple 16S rRNA regions

Whole genome sequencing approaches for taxonomic profiling and evaluation of wastewater quality

Metagenomic profiling pipelines improve taxonomic classification for 16S amplicon sequencing data

Evaluation of metagenomic, 16S rRNA gene and ultra-plexed PCR-based sequencing approaches for profiling antimicrobial resistance gene and bacterial taxonomic composition of polymicrobial samples

A multi-amplicon 16S rRNA sequencing and analysis method for improved taxonomic profiling of bacterial communities

Evaluation of 16S rRNA gene sequencing for species and strain-level microbiome analysis

An integrated strain-level analytic pipeline utilizing longitudinal metagenomic data