Fast and Accurate Classification of Meta-Genomics Long Reads with deSAMBA

Gaoyang Li, Yongzhuang Liu, Deying Li, Bo Liu, Junyi Li, Yang Hu, Yadong Wang
DOI: https://doi.org/10.3389/fcell.2021.643645
IF: 5.5
2021-01-01
Frontiers in Cell and Developmental Biology
Abstract: Fast and accurate tools for identifying the taxonomies of noisy long reads are still lacking, which is a bottleneck to the use of promising long-read metagenomic sequencing technologies. Herein, we propose de Bruijn graph-based Sparse Approximate Match Block Analyzer (deSAMBA), a tailored long-read classification approach that uses a novel pseudo-alignment algorithm based on sparse approximate match block (SAMB). Benchmarks on real sequencing datasets demonstrate that deSAMBA achieves high yields and fast speed simultaneously, outperforms state-of-the-art tools, and has considerable potential for cutting-edge metagenomics studies.