A Parallel Pairwise Alignment with Pruning for Large Genomic Sequences

Xiangyuan Zhu, Bing Li, Kenli Li, Ping Shao, Yi Pan
DOI: https://doi.org/10.1109/PDCAT.2017.00047
2017-01-01
Abstract: Pairwise sequence alignment is a common and fundamental task in Computational Biology and forms the basis of many Bioinformatics applications. In the post-genomic era, there is an increasing demand to align long DNA sequences in order to discover their functions. In this paper, we propose a parallel pairwise alignment algorithm for large genomic sequences that recursively divides the whole genomic sequences into small pieces and applies an effective pruning strategy to reduce the search and computation space. We ran rigorous tests on a 4-core computer using real genomic sequences and artificially generated sequences. The results show that our implementation achieves a speedup of 10.64 with 99.75% accuracy compared to the sequential algorithm. As far as we know, this is the first time that MBP-scale (mega base-pair) sequences have been globally aligned with an affine gap penalty.
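For orientation, global alignment with an affine gap penalty is conventionally computed with a three-state dynamic program (Gotoh's recurrence). The sketch below shows only that sequential, quadratic-memory baseline, not the parallel divide-and-prune algorithm proposed in the paper. The function name `affine_global_score` and the scoring values are illustrative assumptions, not taken from the paper.

```python
NEG_INF = float("-inf")

def affine_global_score(a, b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Best global alignment score of a and b.

    Assumed scoring: a gap of length k costs gap_open + k * gap_extend.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]
    # X: a[i-1] aligned to a gap (gap in b)
    # Y: b[j-1] aligned to a gap (gap in a)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend   # leading gap in b
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend   # leading gap in a

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap pays gap_open + gap_extend; extending pays gap_extend.
            X[i][j] = max(M[i - 1][j] + gap_open + gap_extend,
                          Y[i - 1][j] + gap_open + gap_extend,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open + gap_extend,
                          X[i][j - 1] + gap_open + gap_extend,
                          Y[i][j - 1] + gap_extend)

    return max(M[n][m], X[n][m], Y[n][m])

if __name__ == "__main__":
    print(affine_global_score("ACGT", "AT"))
```

Because this baseline fills three full (n+1) x (m+1) tables, it is impractical for mega base-pair inputs, which is the scale the paper targets with its recursive division and pruning.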