Accelerating Gene Clustering on Heterogeneous Clusters

Jian-wen WEI, Zhi-geng XU, Bing-qiang WANG, SEE Simon, LIN James
DOI: https://doi.org/10.11896/j.issn.1002-137X.2017.03.005
2017-01-01
Computer Science
Abstract: Metagenome clustering is a novel approach to detecting flawed genes. It relies on massive gene data, effective clustering algorithms, and efficient implementations. Calculating the correlation matrix is the essential step in clustering and accounts for most of the computing time. For example, clustering a gene repository with 1,300 samples and a million genes would take about 27 years. Developing an efficient implementation of the correlation matrix calculation is therefore essential. After analyzing the algorithms, we proposed and applied several optimizations. First, we implemented an efficient multithreaded version using OpenMP dynamic scheduling. Second, we further improved performance by making efficient use of the cache on the CPU and shared memory on the GPU. Third, we implemented a load-balanced work distribution that works well for the MPI program on the CPU. Compared with the unoptimized single-threaded CPU program, the two fastest versions, MPI+OpenMP on 256 CPU cores and MPI+CUDA on 6 GPU cards, achieve speedups of 238.8 and 263.8, respectively.
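
The abstract does not state which correlation measure or which work distribution scheme the authors used. As a rough illustration only, the sketch below assumes Pearson correlation between gene profiles and a simple mirrored-row pairing to balance the upper-triangular pair workload across workers. The names pearson, rows_for_worker, pairs_in_rows, and correlate_rows are hypothetical and do not come from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumed measure)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        # A constant profile has no defined correlation; this sketch reports 0.0.
        return 0.0
    return cov / math.sqrt(vx * vy)

def rows_for_worker(n_genes, n_workers, worker):
    """Rows assigned to one worker (hypothetical balancing scheme).

    Row i of the upper triangle holds n_genes - 1 - i pairs, so contiguous
    row blocks are unbalanced. Rows i and n_genes - 1 - i together always
    hold n_genes - 1 pairs, so mirrored rows are handed out as a unit.
    """
    return [i for i in range(n_genes)
            if min(i, n_genes - 1 - i) % n_workers == worker]

def pairs_in_rows(n_genes, rows):
    """Number of (i, j) pairs with j > i covered by the given rows."""
    return sum(n_genes - 1 - i for i in rows)

def correlate_rows(genes, rows):
    """Compute the upper-triangular correlations for the given rows."""
    n = len(genes)
    return {(i, j): pearson(genes[i], genes[j])
            for i in rows for j in range(i + 1, n)}

if __name__ == "__main__":
    genes = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
    for w in range(2):
        rows = rows_for_worker(len(genes), 2, w)
        print(w, rows, correlate_rows(genes, rows))
```

In an MPI setting each rank would play the role of one worker and compute only its own rows. The mirrored pairing is one of several ways to even out the triangular workload, and a dynamic scheduler, such as the OpenMP dynamic scheduling mentioned in the abstract, is another.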