Parallel Optimization of Variation Detection Algorithms for Large-Scale Genome Data

CUI Yingbo, HUANG Chun, TANG Tao, YANG Canqun, LIAO Xiangke, PENG Shaoliang
DOI: https://doi.org/10.11959/j.issn.2096-0271.2020041
IF: 3.3
2020-01-01
Big Data Research
Abstract: Sequence alignment and mutation detection are the basic steps of genomic data analysis. They are prerequisites for subsequent functional analysis and also the most time-consuming steps. To deal effectively with the massive genomic data produced by high-throughput sequencing technology, MPI, OpenMP, and other technologies were used to perform multi-level parallel optimization of the sequence alignment algorithm and the SNP detection algorithm. In tests on different data sets and at different parallel scales, the core algorithm achieved a speedup of more than 9x, and parallel efficiency remained above 60% in large-scale tests. The improved algorithms show good parallel performance and scalability, which effectively improves the capability for mutation detection on large-scale genomic data.
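The abstract reports its results as speedup and parallel efficiency without defining them. The sketch below uses the conventional definitions: speedup is baseline runtime divided by parallel runtime, and efficiency is speedup divided by the factor by which resources were scaled up. It is an assumption that the paper uses these definitions. The function names and the timings in the usage example are hypothetical and do not come from the paper.

```python
def speedup(baseline_seconds: float, parallel_seconds: float) -> float:
    """Return baseline runtime divided by parallel runtime."""
    if baseline_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / parallel_seconds


def parallel_efficiency(
    baseline_seconds: float, parallel_seconds: float, scale_factor: float
) -> float:
    """Return speedup divided by the resource scale factor.

    scale_factor is how many times more processes, threads, or nodes
    the parallel run used than the baseline run (assumed definition).
    """
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    return speedup(baseline_seconds, parallel_seconds) / scale_factor


if __name__ == "__main__":
    # Hypothetical timings for illustration only, not measurements
    # from the paper: 100 s baseline, 25 s on 8 times the resources.
    s = speedup(100.0, 25.0)
    e = parallel_efficiency(100.0, 25.0, 8)
    print(f"speedup = {s:.2f}x, efficiency = {e:.0%}")
```

Under these definitions, an efficiency that stays above 60% means the runtime at scale is no more than about 1.67 times what ideal linear scaling would give.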