SMeta, a Binning Tool Using Single-Cell Sequences to Aid Reconstructing Metagenome Species Accurately

Yuhao Zhang, Mingyue Cheng, Kang Ning
DOI: https://doi.org/10.1101/2024.08.25.609542
2024-01-01
Abstract: Because metagenomic data are large in volume and complex in structure, traditional binning methods often struggle to classify microbial metagenomes effectively. Introducing longer and more accurate single-cell sequencing data is one possible way to address these challenges. Inspired by the existing MetaBAT2 tool, this study develops a new vector-based binning algorithm, SMeta, which uses both metagenomic and single-cell sequencing data. SMeta is designed specifically for eukaryotic microbial metagenomes and takes advantage of the long reads characteristic of single-cell data. By introducing the segment tree data structure, the algorithm quickly aligns long single-cell sequences with short metagenomic sequences. The approach also allows reference genomes from genomic databases to be used in place of single-cell data, which enables more precise identification and reconstruction of small genome fragments that traditional methods typically overlook. It may also provide a higher-purity sequence set for subsequent assemblies.

Competing Interest Statement: The authors have declared no competing interest.
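The abstract names the segment tree but does not describe how SMeta uses it. As a general illustration of the data structure only (not the authors' implementation), the following minimal Python sketch counts alignment hits along a long sequence and answers range queries in logarithmic time. The class name `SegmentTree`, the idea of indexing by position bin, and the example hit positions are all assumptions made for illustration.

```python
class SegmentTree:
    """Iterative segment tree supporting point add and range-sum queries.

    Hypothetical use: index i is a position bin on a long single-cell
    sequence, and the stored value is the number of short metagenomic
    sequences whose alignment starts in that bin.
    """

    def __init__(self, size: int) -> None:
        self.n = size
        # Leaves live at indices n .. 2n-1; internal nodes at 1 .. n-1.
        self.tree = [0] * (2 * size)

    def add(self, index: int, value: int) -> None:
        """Add `value` to the leaf at `index` and to all of its ancestors."""
        i = index + self.n
        while i >= 1:
            self.tree[i] += value
            i //= 2

    def range_sum(self, left: int, right: int) -> int:
        """Return the sum over the half-open interval [left, right)."""
        total = 0
        lo = left + self.n
        hi = right + self.n
        while lo < hi:
            if lo & 1:          # lo is a right child: take it and move right
                total += self.tree[lo]
                lo += 1
            if hi & 1:          # hi is a right child: step left and take it
                hi -= 1
                total += self.tree[hi]
            lo //= 2
            hi //= 2
        return total


# Illustrative data: four hypothetical alignment hits on a 10-bin sequence.
tree = SegmentTree(10)
for pos in [1, 3, 3, 7]:
    tree.add(pos, 1)

hits_in_first_four_bins = tree.range_sum(0, 4)
hits_in_bins_3_to_7 = tree.range_sum(3, 8)
```

Both updates and queries touch O(log n) nodes, which is the property that makes this kind of structure attractive when many short sequences must be located against a few long ones.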