RePS: a sequence assembler that masks exact repeats identified from the shotgun data.

Jun Wang, Gane Ka-Shu Wong, Peixiang Ni, Yujun Han, Xiangang Huang, Jianguo Zhang, Chen Ye, Yong Zhang, Jianfei Hu, Kunlin Zhang, Xin Xu, Lijuan Cong, Hong Lu, Xide Ren, Xiaoyu Ren, Jun He, Lin Tao, Douglas A Passey, Jian Wang, Huanming Yang, Jun Yu, Songgang Li
DOI: https://doi.org/10.1101/gr.165102
IF: 9.438
2002-01-01
Genome Research
Abstract: We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20mer repeats from the shotgun data and removes them prior to the assembly. The established software is used to compute meaningful error probabilities for each base. Clone-end-pairing information is used to construct scaffolds that order and orient the contigs. We show with real data for human and rice that reasonable assemblies are possible even at coverages of only 4x to 6x, despite having up to 42.2% in exact repeats.
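The repeat-masking step described in the abstract can be illustrated with a minimal Python sketch. This is not the RePS implementation: the function names (`count_kmers`, `mask_repeats`), the occurrence `threshold`, and the mask character `X` are assumptions made for illustration, since the abstract does not state how often a 20mer must occur before it is treated as a repeat. The sketch also ignores reverse-complement strands and sequencing errors.

```python
from collections import Counter


def count_kmers(reads, k=20):
    """Count every exact k-mer across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mask_repeats(reads, k=20, threshold=3, mask_char="X"):
    """Replace bases covered by any k-mer seen at least `threshold` times.

    `threshold` is a hypothetical cutoff chosen for this sketch; in practice
    it would have to be set relative to the shotgun coverage depth.
    """
    counts = count_kmers(reads, k)
    masked_reads = []
    for read in reads:
        masked = list(read)
        for i in range(len(read) - k + 1):
            # Look up the k-mer in the original read, not the masked copy,
            # so overlapping repeats are all detected.
            if counts[read[i:i + k]] >= threshold:
                masked[i:i + k] = mask_char * k
        masked_reads.append("".join(masked))
    return masked_reads
```

With a toy k of 4, calling `mask_repeats(["ACGTACGTTT", "GGGACGTCCC"], k=4, threshold=3)` masks the three occurrences of `ACGT` and leaves the unique flanking bases, which are what a downstream assembler would then use to find overlaps.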