Accelerating Large-Scale Biological Database Search on Xeon Phi-based Neo-Heterogeneous Architectures

Haidong Lan, Weiguo Liu, Bertil Schmidt, Bingqiang Wang
DOI: https://doi.org/10.1109/bibm.2015.7359735
2015-01-01
Abstract: In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neo-heterogeneous architectures. To make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. On the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. On the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We present two dynamic task distribution schemes (thread level and device level) to achieve better load balancing. Furthermore, a multi-threaded asynchronous scheme is used to overlap communication and computation between CPUs and Xeon Phis. Evaluations on real protein sequence databases show that our method achieves a peak overall performance of up to 220 GCUPS on a neo-heterogeneous platform consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of database size and query length. Our implementation is available at: http://turbo0628.github.io/LSBDS/.
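For readers unfamiliar with the terms in the abstract, the following is a minimal scalar sketch of the Smith-Waterman local alignment score and of the GCUPS (giga cell updates per second) metric. It is not the authors' implementation: the function names `sw_score` and `gcups`, the match/mismatch scores, and the linear gap penalty are all assumptions chosen for illustration. Protein searches normally use a substitution matrix with affine gaps, and the paper's code relies on SIMD vectorization rather than a scalar loop.

```python
def sw_score(query, subject, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (illustrative scoring, linear gap penalty)."""
    prev = [0] * (len(subject) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if q == s else mismatch)
            # Clamp at zero so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Giga cell updates per second: DP cells computed / time / 1e9."""
    return query_len * db_residues / seconds / 1e9


if __name__ == "__main__":
    print(sw_score("TACGT", "GGACGA"))
    # Hypothetical workload: 1000-residue query, 220 million database residues, 1 s.
    print(gcups(1000, 220_000_000, 1.0))
```

Each database sequence is scored independently of the others, which is what makes the search amenable to distributing sequences dynamically across threads and devices as the abstract describes.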