Heuristic Clustering Method Based on Neighbor-Seeds for 454 Sequencing Data

Wei CHEN, Yong-Mei CHENG, Shao-Wu ZHANG, Quan PAN
DOI: https://doi.org/10.13328/j.cnki.jos.004547
2014-01-01
Abstract: With the development of next-generation sequencing technology, a large number of 16S rRNA gene reads have been collected. A key issue is to develop novel methods for mining the information hidden in these data. Sequence clustering aims to find the natural groups in large-scale data, which can help us understand the species, functional, and structural diversity of microbial communities. This work proposes a heuristic clustering method based on Neighbor-seeds, named NbHClust, for 454 sequencing data. The results show that this method reduces the overestimation of operational taxonomic units (OTUs) and offers good robustness and high clustering accuracy.