A high-performance computational workflow to accelerate GATK SNP detection across a 25-genome dataset

Yong Zhou, Nagarajan Kathiresan, Zhichao Yu, Luis F. Rivera, Yujian Yang, Manjula Thimma, Keerthana Manickam, Dmytro Chebotarov, Ramil Mauleon, Kapeel Chougule, Sharon Wei, Tingting Gao, Carl D. Green, Andrea Zuccolo, Weibo Xie, Doreen Ware, Jianwei Zhang, Kenneth L. McNally, Rod A. Wing
DOI: https://doi.org/10.1186/s12915-024-01820-5
IF: 7.364
2024-01-26
BMC Biology
Abstract: Single-nucleotide polymorphisms (SNPs) are the most widely used markers in studies of molecular genetic variation. As reference genomes and resequencing data sets expand exponentially, tools must be in place to call SNPs at a similar pace. The Genome Analysis Toolkit (GATK) is one of the most widely used publicly available SNP-calling software tools, but high-performance computing versions of it have yet to become widely available and affordable.
biology
What problem does this paper attempt to address?