Abstract: Biological sequence alignment is a fundamental application in bioinformatics. It can be used to identify functionally conserved sequences and to find evolutionary relationships between species. To compare entire genomes from different species, biologists increasingly need alignment methods that are efficient enough to handle long sequences and accurate enough to correctly align the conserved biological features between distant species. Global alignments are important because they reveal the shared order of biological features in the compared species, and they produce a more accurate alignment at the base-pair level when the features are in the same order. The best-known global alignment algorithm is Needleman-Wunsch. Later, BitPAl, a bit-parallel algorithm for global alignment with general integer scoring, provided a new implementation of the Needleman-Wunsch algorithm (BitNW). Compared with the original Needleman-Wunsch algorithm, BitNW is significantly faster because it exploits bit parallelism. A number of parallel strategies have been proposed to accelerate exact alignment methods. However, most of them fail to align long biological sequences because of their quadratic time complexity. In this paper, we propose SLPal, a fast bit-parallel algorithm for accelerating long DNA sequence comparison on Intel many-core and multi-core architectures. To fully exploit the computing power of many cores and the 512-bit vector processing units (VPUs), we use a two-level parallelism scheme: a coarse-grained thread-level approach and a fine-grained VPU-level approach. At the thread level, the alignment scoring matrix is split into small tiles, and multiple threads process these tiles concurrently using the Intel TBB library. At the VPU level, the computing kernels are implemented using Single Instruction Multiple Data (SIMD) instructions, so that the 16 independent integers residing in a 512-bit vector register can be processed simultaneously.
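The thread-level tiling described above can be illustrated with a minimal sketch. It is not SLPal itself: it uses the plain Needleman-Wunsch recurrence with a linear gap penalty rather than the bit-parallel kernel, Python's `ThreadPoolExecutor` stands in for Intel TBB, and the scores, tile size, and function names are assumptions chosen for illustration. It shows only the dependency pattern: a tile needs the tiles above, to the left, and diagonally above-left, so all tiles on one anti-diagonal of the tile grid can be processed concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

MATCH, MISMATCH, GAP = 2, -1, -2  # illustrative scores, not taken from the paper


def nw_score_plain(a, b):
    """Reference Needleman-Wunsch global alignment score (linear gap penalty)."""
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP]
        for j in range(1, len(b) + 1):
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            cur.append(max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def nw_score_tiled(a, b, tile=4, workers=4):
    """Same score, computed tile by tile along anti-diagonals of the tile grid."""
    n, m = len(a), len(b)
    # The full matrix is kept for clarity; row 0 and column 0 hold the gap-only boundary.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        H[i][0] = i * GAP
    for j in range(m + 1):
        H[0][j] = j * GAP

    def fill_tile(ti, tj):
        # Cells of tile (ti, tj) read only cells from this tile and from the
        # tiles above, to the left, and diagonally above-left.
        for i in range(ti * tile + 1, min((ti + 1) * tile, n) + 1):
            for j in range(tj * tile + 1, min((tj + 1) * tile, m) + 1):
                sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
                H[i][j] = max(H[i - 1][j - 1] + sub,
                              H[i - 1][j] + GAP,
                              H[i][j - 1] + GAP)

    rows = -(-n // tile)  # ceiling division: number of tile rows
    cols = -(-m // tile)  # number of tile columns
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for d in range(rows + cols - 1):
            wave = [(ti, d - ti) for ti in range(rows) if 0 <= d - ti < cols]
            # Tiles in one wave are mutually independent; finish them all
            # before starting the next anti-diagonal.
            list(pool.map(lambda t: fill_tile(*t), wave))
    return H[n][m]
```

In a real implementation each tile would be handled by a vectorized kernel, and only tile borders would be stored rather than the full matrix. The sketch keeps the whole matrix so that the wavefront ordering is easy to follow.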
The evaluation shows that our algorithm achieves stable performance on all benchmark data, reaching up to 511.7 GCUPS on a server with a single Xeon Phi 7210 processor and up to 617.2 GCUPS on a server with dual Xeon Gold 6148 20-core processors. Furthermore, our tests show that SLPal can align two sequences of about 5 million bp in 50 seconds on our server equipped with dual Xeon Gold 6148 CPUs.
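GCUPS (giga cell updates per second) is the number of scoring-matrix cells computed, divided by the runtime, in units of 10^9. As a rough consistency check on the figures above, the following sketch computes the throughput implied by the 50-second alignment, assuming that both sequences are about 5 million bp long. The function name `gcups` is introduced here for illustration only.

```python
def gcups(len_a, len_b, seconds):
    """Giga cell updates per second for one pairwise alignment."""
    return (len_a * len_b) / seconds / 1e9


# Assumption: each of the two sequences is about 5 million bp long.
rate = gcups(5_000_000, 5_000_000, 50.0)
print(f"{rate:.1f} GCUPS")  # prints "500.0 GCUPS"
```

Under that assumption the long-sequence run corresponds to roughly 500 GCUPS, which is below the reported peak of 617.2 GCUPS on the same dual Xeon Gold 6148 server.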

Parallel Algorithms for Large-Scale Biological Sequence Alignment on Xeon-Phi Based Clusters

Gene Sequence Alignment on a Public Computing Platform

Accelerating Large-Scale Biological Database Search on Xeon Phi-based Neo-Heterogeneous Architectures

A data parallel strategy for aligning multiple biological sequences on multi-core computers

SLPal: Accelerating Long Sequence Alignment on Many-Core and Multi-Core Architectures

A Data Parallel Strategy for Aligning Multiple Biological Sequences on Homogeneous Multiprocessor Platform

Parallel Multiple Sequences Alignment in SMP Cluster

XLCS - A New Bit-Parallel Longest Common Subsequence Algorithm on Xeon Phi Clusters

Parallel Local Alignment Algorithm for Multiple Sequences on Heterogeneous Cluster Systems

GPU Accelerated Biological Sequence Alignment

XSW: Accelerating Biological Database Search on Xeon Phi

A high-throughput gene sequence alignment strategy using parallel computing

Mega-base Biological Sequence Alignment Targeting OpenCL Architecture

Parallel linear space algorithm for large-scale sequence alignment

Parallel Divide and Conquer Bio-Sequence Comparison Based on Smith-Waterman Algorithm

Cluster-Distribute-Align-Merge: A General Algorithm to Speed Up Multiple Sequence Alignment on Multi-Core Computers

MyPhi: Efficient Levenshtein Distance Computation on Xeon Phi Based Architectures

ClustalXeed: a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment

HAlign-II: Efficient Ultra-Large Multiple Sequence Alignment and Phylogenetic Tree Reconstruction with Distributed and Parallel Computing

Distributed Sequence Alignment Applications for the Public Computing Architecture

Parallel Algorithm for Multiple Genome Alignment on the Grid Environment