MECAT: an ultra-fast mapping, error correction and de novo assembly tool for single-molecule sequencing reads

Chuan-Le Xiao, Ying Chen, Shang-qian Xie, Kai-Ning Chen, Yan Wang, Feng Luo, Zhi Xie
DOI: https://doi.org/10.1101/089250
IF: 48
2016-01-01
Nature Methods
Abstract: The high computational cost of current assembly methods for long, noisy single-molecule sequencing (SMS) reads has prevented them from assembling large genomes. We introduce an ultra-fast alignment method based on a novel global alignment score. For large human SMS data, our method is 7X faster than MHAP for pairwise alignment and 15X faster than BLASR for reference mapping. We develop a Mapping, Error Correction and de novo Assembly Tool (MECAT) by integrating our new alignment and error correction methods with the Celera Assembler. MECAT is capable of producing high-quality de novo assemblies of large genomes from SMS reads at low computational cost. MECAT produces reference-quality assemblies of Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster. It also reconstructs the human CHM1 genome with a 15% longer NG50 in only 7600 CPU core hours using 54X SMS reads, and a Chinese Han genome in 19200 CPU core hours using 102X SMS reads.
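The NG50 figure quoted in the abstract can be illustrated with a minimal sketch, assuming the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the estimated genome size. The function name `ng50` and the toy contig lengths are illustrative assumptions and are not taken from MECAT.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly, or None if the contigs
    cover less than half of the estimated genome size."""
    target = genome_size / 2
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return None


# Toy example (made-up numbers): five contigs, estimated genome size of 200.
contigs = [50, 40, 30, 20, 10]
print(ng50(contigs, 200))
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses the estimated genome size, so it remains comparable across assemblies of the same genome that differ in total length.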