Clade-specific long-read sequencing increases the accuracy and specificity of the gyrB phylogenetic marker gene

Robert G Nichols, Emily R Davenport
DOI: https://doi.org/10.1101/2024.07.15.603574
2024-07-17
Abstract: Phylogenetic marker gene sequencing is often used as a quick and cost-effective way of evaluating microbial composition within a community. While 16S rRNA gene sequencing (16S) is commonly used for bacteria and archaea, other marker genes are preferable in certain situations, such as when 16S sequences cannot distinguish between taxa within a group. Another such situation arises when researchers want to study cospeciation with host taxa that diverged too recently for the slowly evolving 16S rRNA gene to resolve. For example, the bacterial gyrase subunit B (gyrB) gene has been used to investigate cospeciation between the microbiome and various hominid species. However, to date, the only primers available for investigating gyrB of the Bacteroidaceae, Bifidobacteriaceae, and Lachnospiraceae families generate short amplicons sized for Illumina MiSeq sequencing. Here, we update this methodology by creating gyrB primers for the same three families for long-read PacBio sequencing, and we characterize them against established short-read gyrB primer sets. We demonstrate both bioinformatically and analytically that these longer amplicons offer more sequence space for greater taxonomic resolution, lower off-target amplification rates, and lower error rates with PacBio CCS sequencing than with established short-read sequencing. The availability of these long-read gyrB primers will prove integral to the continued analysis of cospeciation between bacterial members of the gut microbiome and recently diverged host species.
Genomics