Large-Scale DNA Sequence Assembly by Using Computing Grid

Xiaoyong Fang, Zhigang Luo, Zhenghua Wang, Fan Ding
DOI: https://doi.org/10.1109/GCCW.2006.59
2006-01-01
Abstract: DNA sequence assembly is a fundamental part of biological computing. However, most large-scale sequence assemblies require intensive computing power and huge storage. To speed up the assembly process, we propose a method for large-scale DNA sequence assembly that uses a computing grid. The central idea of our method is to first cluster the input fragment set into many non-intersecting subsets using k-mers, and then to distribute these subsets across the nodes of the grid-computing system. On the test data sets, under a simulated grid-computing system, our method achieves an accuracy of more than 92% while requiring less time and less storage. Our method can efficiently process large-scale DNA sequence assembly by taking advantage of the huge storage and computing capacity of a computing grid.
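The clustering-then-distribution idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the clustering rule, so the sketch assumes that two fragments belong to the same subset whenever they share at least one exact k-mer (merged with a union-find structure), ignores reverse complements and sequencing errors, and assigns subsets to nodes with a simple greedy load balancer. The function names `kmers`, `cluster_fragments`, and `assign_to_nodes` are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_fragments(fragments, k):
    """Partition fragment indices into disjoint subsets.

    Assumption (not from the paper): fragments that share at least one
    exact k-mer, directly or transitively, end up in the same subset.
    """
    parent = list(range(len(fragments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_owner = {}  # k-mer -> index of the first fragment containing it
    for idx, frag in enumerate(fragments):
        for km in kmers(frag, k):
            if km in first_owner:
                parent[find(idx)] = find(first_owner[km])
            else:
                first_owner[km] = idx

    groups = defaultdict(list)
    for idx in range(len(fragments)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


def assign_to_nodes(clusters, fragments, n_nodes):
    """Greedy assignment: largest subset first, to the least-loaded node."""
    def size(cluster):
        return sum(len(fragments[i]) for i in cluster)

    loads = [0] * n_nodes
    plan = [[] for _ in range(n_nodes)]
    for cluster in sorted(clusters, key=size, reverse=True):
        node = loads.index(min(loads))
        plan[node].append(cluster)
        loads[node] += size(cluster)
    return plan


fragments = ["ACGTAC", "GTACGG", "TTTTAA", "TTAACC", "GGGGGG"]
clusters = cluster_fragments(fragments, k=4)
plan = assign_to_nodes(clusters, fragments, n_nodes=2)
print(clusters)
print(plan)
```

Because the subsets share no k-mers, each node could in principle assemble its subsets independently, which is what makes the work easy to spread over a grid. A real pipeline would also have to handle reverse-complement strands, repeats that merge unrelated fragments into one oversized subset, and sequencing errors, none of which this sketch attempts.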