MECAT: an ultra-fast mapping, error correction and de novo assembly tool for single-molecule sequencing reads

Chuan-Le Xiao, Ying Chen, Shang-qian Xie, Kai-Ning Chen, Yan Wang, Feng Luo, Zhi Xie
DOI: https://doi.org/10.1101/089250
2016-01-01
Abstract: The high computational cost of current assembly methods for long, noisy single-molecule sequencing (SMS) reads has prevented them from assembling large genomes. We introduce an ultra-fast alignment method based on a novel global alignment score. For large human SMS data, our method is 7X faster than MHAP for pairwise alignment and 15X faster than BLASR for reference mapping. We develop a Mapping, Error Correction and de novo Assembly Tool (MECAT) by integrating our new alignment and error correction methods with the Celera Assembler. MECAT is capable of producing high-quality de novo assemblies of large genomes from SMS reads at low computational cost. MECAT produces reference-quality assemblies of Saccharomyces cerevisiae, Arabidopsis thaliana, and Drosophila melanogaster, and reconstructs the human CHM1 genome with a 15% longer NG50 in only 7600 CPU core hours using 54X SMS reads, and a Chinese Han genome in 19200 CPU core hours using 102X SMS reads.