A data parallel strategy for aligning multiple biological sequences on multi-core computers

Xiangyuan Zhu, Kenli Li, Ahmad Salah
DOI: https://doi.org/10.1016/j.compbiomed.2012.12.009
IF: 7.7
2013-01-01
Computers in Biology and Medicine
Abstract: In this paper, we address the large-scale biological sequence alignment problem, which is in increasing demand in computational biology. We employ the data parallelism paradigm, which is suitable for large-scale processing on multi-core computers, to achieve a high degree of parallelism. Using this paradigm, we propose a general strategy that can be used to speed up any multiple sequence alignment method. We applied five different clustering algorithms in our strategy and ran rigorous tests on an 8-core computer using four traditional benchmarks and artificially generated sequences. The results show that our multi-core-based implementations can achieve up to 151-fold improvements in execution time while losing 2.19% accuracy on average. The source code of the proposed strategy, together with the test sets used in our analysis, is available on request.
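The data-parallel idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithms, the aligner, or how per-cluster results are combined. Every name here (`cluster_by_prefix`, `align_cluster`, `parallel_align`) is hypothetical. The prefix-based clustering and the gap-padding "aligner" are deliberately trivial stand-ins for a real clustering algorithm and a real multiple sequence alignment method, and the final merge is an assumed step.

```python
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cluster_by_prefix(seqs, k=2):
    """Hypothetical stand-in clustering: group sequences by their first k characters."""
    clusters = defaultdict(list)
    for s in seqs:
        clusters[s[:k]].append(s)
    # Sort keys so the cluster order is deterministic.
    return [clusters[key] for key in sorted(clusters)]


def align_cluster(cluster):
    """Placeholder 'aligner': right-pad with gaps to a common width.

    A real MSA method would be called here instead; this only keeps the
    sketch self-contained and runnable.
    """
    width = max(len(s) for s in cluster)
    return [s.ljust(width, "-") for s in cluster]


def parallel_align(seqs, executor_cls=ProcessPoolExecutor, workers=8):
    """Cluster the input, align each cluster on its own worker, then merge."""
    if not seqs:
        return []
    clusters = cluster_by_prefix(seqs)
    # Data parallelism: the same align_cluster function runs on each cluster.
    with executor_cls(max_workers=workers) as pool:
        aligned_blocks = list(pool.map(align_cluster, clusters))
    # Assumed merge step: pad every block to one global width. A real
    # pipeline would combine the per-cluster alignments more carefully.
    width = max(len(row) for block in aligned_blocks for row in block)
    return [row.ljust(width, "-") for block in aligned_blocks for row in block]
```

Calling `parallel_align(sequences)` uses one process per worker by default. When run as a script, place that call under an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform. Passing `executor_cls=ThreadPoolExecutor` runs the same pipeline with threads, which is convenient for quick experiments.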