Accelerating genomic workflows using NVIDIA Parabricks
Kyle A. O’Connell, Zelaikha B. Yosufzai, Ross A. Campbell, Collin J. Lobb, Haley T. Engelken, Laura M. Gorrell, Thad B. Carlson, Josh J. Catana, Dina Mikdadi, Vivien R. Bonazzi, Juergen A. Klenk
DOI: https://doi.org/10.1186/s12859-023-05292-2
IF: 3.307
2023-06-02
BMC Bioinformatics
Abstract: As genome sequencing becomes better integrated into scientific research, government policy, and personalized medicine, the primary challenge for researchers is shifting from generating raw data to analyzing these vast datasets. Although much work has been done to reduce compute times using various configurations of traditional CPU computing infrastructures, Graphics Processing Units (GPUs) offer opportunities to accelerate genomic workflows by orders of magnitude. Here we benchmark one GPU-accelerated software suite called NVIDIA Parabricks on Amazon Web Services (AWS), Google Cloud Platform (GCP), and an NVIDIA DGX cluster. We benchmarked six variant calling pipelines, including two germline callers (HaplotypeCaller and DeepVariant) and four somatic callers (Mutect2, Muse, LoFreq, SomaticSniper).
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology