16S-FASAS: an integrated pipeline for synthetic full-length 16S rRNA gene sequencing data analysis

Ke Zhang, Rongnan Lin, Yujun Chang, Qing Zhou, Zhi Zhang
DOI: https://doi.org/10.7717/peerj.14043
IF: 3.061
2022-09-23
PeerJ
Abstract:
Background: Full-length 16S rRNA gene sequencing provides better taxonomic and phylogenetic resolution than partial 16S rRNA gene sequencing. The 16S-FAS-NGS (16S rRNA full-length amplicon sequencing based on a next-generation sequencing platform) technology can generate high-quality, full-length 16S rRNA gene sequences using short-read sequencers together with assembly procedures. However, a data analysis suite for processing and analyzing the resulting synthetic long-read data has been lacking.
Results: Herein, we developed software named 16S-FASAS (16S full-length amplicon sequencing data analysis software) for 16S-FAS-NGS data analysis, which provides high-fidelity species-level microbiome data. 16S-FASAS consists of data quality control, de novo assembly, annotation, and visualization modules. We verified the performance of 16S-FASAS on both mock and fecal samples. In mock communities, we showed that taxonomy assignment by MegaBLAST produced fewer misclassifications and tended to detect more low-abundance species than the USEARCH-UNOISE3-based classifier, resulting in species-level classification of 85.71% (6/7), 85.71% (6/7), 72.72% (8/11), and 70% (7/10) of the target bacteria. When applied to fecal samples, the 16S-FAS-NGS datasets generated contigs grouped into 60 and 56 species, of which 71.62% (43/60) and 76.79% (43/56) were shared with the PacBio datasets.
Conclusions: 16S-FASAS is a valuable tool that helps researchers process and interpret the results of full-length 16S rRNA gene sequencing. By relying on full-length amplicon sequencing technology, the 16S-FASAS pipeline enables a more accurate report on the bacterial complexity of microbiome samples. 16S-FASAS is freely available at https://github.com/capitalbio-bioinfo/FASAS.
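The shared-species percentages in the abstract are plain set overlaps: the number of species found in both datasets divided by the number found in the 16S-FAS-NGS dataset. The following is a minimal sketch of that arithmetic, not code from 16S-FASAS. The function name `shared_fraction` and the placeholder species labels are hypothetical; only the counts 56 and 43 come from the abstract, and the number of PacBio-only species (10) is an arbitrary assumption that does not affect the result.

```python
def shared_fraction(query_species, reference_species):
    """Return (shared_count, query_count, percent_shared).

    percent_shared is the share of distinct query species that
    also appear in the reference list.
    """
    query = set(query_species)
    reference = set(reference_species)
    if not query:
        raise ValueError("query_species must contain at least one species")
    shared = query & reference
    return len(shared), len(query), 100.0 * len(shared) / len(query)


# Hypothetical placeholder labels, not real taxa from the paper.
# 56 species in the 16S-FAS-NGS dataset (count taken from the abstract).
fas_ngs = [f"species_{i}" for i in range(56)]
# 43 of them also appear in the PacBio dataset (count taken from the abstract);
# the 10 PacBio-only entries are an arbitrary assumption.
pacbio = [f"species_{i}" for i in range(43)] + [f"pacbio_only_{i}" for i in range(10)]

n_shared, n_total, pct = shared_fraction(fas_ngs, pacbio)
print(f"{pct:.2f}% ({n_shared}/{n_total})")  # prints "76.79% (43/56)"
```

The denominator is the 16S-FAS-NGS species count rather than the union of both datasets, which matches how the fractions 43/60 and 43/56 are written in the abstract.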
multidisciplinary sciences