Retrieving Myers-Miller Alignments For Pairwise Biological Sequences Using Spark

Xiangyuan Zhu, Bing Li, Jian Li, Kenli Li
DOI: https://doi.org/10.1109/fskd.2017.8393085
2017-01-01
Abstract: The Myers-Miller algorithm is a widely used global alignment tool in computational biology that runs in quadratic time and linear space. Because of its high time cost, aligning megabase sequences with the Myers-Miller tool is infeasible. However, cloud computing is a promising platform for obtaining alignments of megabase sequences in a feasible time. In this paper, we present Cloud Myers-Miller, a parallel algorithm that constructs alignments of huge sequences in the cloud. Cloud Myers-Miller is divided into three stages: preparation, parallel processing, and collection. Our results on a five-machine cluster show high speed-up for long real DNA sequences. Cloud Myers-Miller can align DNA sequences of more than 543 KBP (kilo base-pairs).
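To make the "quadratic time and linear space" property concrete, the following is a minimal sketch of the divide-and-conquer idea that linear-space global alignment rests on. It is not the paper's Cloud Myers-Miller and not the Myers-Miller algorithm itself: Myers-Miller handles affine gap penalties, whereas this sketch uses Hirschberg's simpler scheme with a single linear gap penalty. The function names and the scoring values (match=1, mismatch=-1, gap=-2) are assumptions chosen for illustration, and string slicing is used for readability, so an index-based version would be needed to keep memory strictly linear.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-2):
    """Last row of the global-alignment score matrix, keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def hirschberg(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment of a and b (linear gap penalty, assumed scores)."""
    if len(a) == 0:
        return "-" * len(b), b
    if len(b) == 0:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: either pair the single character with its best partner
        # in b, or leave it unpaired and gap out everything.
        scores = [match if a == c else mismatch for c in b]
        j = max(range(len(b)), key=scores.__getitem__)
        if scores[j] + (len(b) - 1) * gap >= (len(b) + 1) * gap:
            return "-" * j + a + "-" * (len(b) - j - 1), b
        return a + "-" * len(b), "-" + b

    # Divide: split a in the middle, then find where the optimal path
    # crosses that row by combining forward and reverse score rows.
    mid = len(a) // 2
    left = nw_last_row(a[:mid], b, match, mismatch, gap)
    right = nw_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])

    # Conquer: the two halves are independent subproblems.
    top_l, bot_l = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    top_r, bot_r = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return top_l + top_r, bot_l + bot_r


def alignment_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score of an already gapped alignment under the same assumed scheme."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += gap
        else:
            total += match if p == q else mismatch
    return total


if __name__ == "__main__":
    top, bottom = hirschberg("AGTACGCA", "TATGC")
    print(top)
    print(bottom)
    print(alignment_score(top, bottom))
```

Because the two recursive calls do not depend on each other, subproblems of this kind are natural units of work to distribute across machines, which is one plausible reading of why a three-stage split into preparation, parallel processing, and collection fits this family of algorithms.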