An efficient algorithm for DNA fragment assembly in MapReduce.

Baomin Xu, Jin Gao, Chunyan Li
DOI: https://doi.org/10.1016/j.bbrc.2012.08.101
IF: 3.1
2012-01-01
Biochemical and Biophysical Research Communications
Abstract: Fragment assembly is one of the most important problems in sequence assembly. Algorithms for DNA fragment assembly based on de Bruijn graphs have been widely used, but they require a large amount of memory and running time to build the graph. Another drawback of the conventional de Bruijn approach is the loss of information. To overcome these shortcomings, this paper proposes a parallel strategy for constructing the de Bruijn graph. Its main characteristic is that it avoids dividing the de Bruijn graph. A novel fragment assembly algorithm based on this parallel strategy is implemented in the MapReduce framework. The experimental results show that the parallel strategy can effectively improve computational efficiency and remove the memory limitations of the assembly algorithm based on Euler superpaths. This paper offers a useful step toward assembling large-scale genome sequences using cloud computing.
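To make the idea of building a de Bruijn graph in a map/reduce style concrete, here is a minimal single-process sketch in Python. It is not the authors' algorithm, which the abstract does not spell out. It assumes the standard textbook construction, in which each k-mer in a read contributes one edge from its (k-1)-length prefix to its (k-1)-length suffix. The function names `map_read`, `reduce_edges`, and `build_de_bruijn` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step (illustrative): emit one (prefix, suffix) edge per k-mer."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        # Each k-mer links its (k-1)-mer prefix to its (k-1)-mer suffix.
        yield kmer[:-1], kmer[1:]


def reduce_edges(pairs):
    """Reduce step (illustrative): group edges by source node and count them."""
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in pairs:
        graph[src][dst] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {src: dict(dsts) for src, dsts in graph.items()}


def build_de_bruijn(reads, k):
    """Run the map step over every read, then reduce all emitted edges."""
    return reduce_edges(chain.from_iterable(map_read(r, k) for r in reads))


if __name__ == "__main__":
    graph = build_de_bruijn(["ACGT", "CGTA"], k=3)
    for node, successors in sorted(graph.items()):
        print(node, successors)
```

In a real MapReduce job the map step would run independently on each read and the framework would shuffle edges by source node before the reduce step, so no single machine would need to hold the whole graph in memory.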