Tuning parameters for polygenic risk score methods using GWAS summary statistics from training data

Wei Jiang, Ling Chen, Matthew J. Girgenti, Hongyu Zhao
DOI: https://doi.org/10.1038/s41467-023-44009-0
IF: 16.6
2024-01-02
Nature Communications
Abstract: Various polygenic risk score (PRS) methods have been proposed to combine the estimated effects of single nucleotide polymorphisms (SNPs) to predict genetic risks for common diseases, using data collected from genome-wide association studies (GWAS). Some methods require an external individual-level GWAS dataset for parameter tuning, posing privacy and security-related concerns. Holding out part of the data for parameter tuning can also reduce model prediction accuracy. In this article, we propose PRStuning, a method that tunes parameters for different PRS methods using GWAS summary statistics from the training data. PRStuning predicts the PRS performance with different parameters, and then selects the best-performing parameters. Because directly using training data effects tends to overestimate the performance in the testing data, we adopt an empirical Bayes approach to shrink the predicted performance in accordance with the genetic architecture of the disease. Extensive simulations and real data applications demonstrate PRStuning’s accuracy across PRS methods and parameters.
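The overestimation and shrinkage idea described in the abstract can be illustrated with a toy simulation. The sketch below is not the PRStuning algorithm. It assumes independent SNPs, a single normal prior on the true effects, and a common known standard error, and it uses a method-of-moments estimate of the prior variance. The function name `eb_shrink` and all simulation settings are hypothetical choices made for illustration.

```python
import numpy as np

def eb_shrink(beta_hat, se):
    """Normal-normal empirical Bayes shrinkage (illustrative assumption,
    not the model used by PRStuning).

    Assumes beta_hat_j = beta_j + noise, with beta_j ~ N(0, tau2) and
    noise ~ N(0, se^2), where se > 0 is known and shared by all SNPs.
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    se2 = float(se) ** 2
    if se2 <= 0.0:
        raise ValueError("se must be positive")
    # Method of moments: E[beta_hat^2] = tau2 + se2, truncated at zero.
    tau2 = max(float(np.mean(beta_hat ** 2)) - se2, 0.0)
    factor = tau2 / (tau2 + se2)  # posterior-mean shrinkage factor in [0, 1]
    return factor * beta_hat, factor

# Simulated data (all settings are arbitrary assumptions).
rng = np.random.default_rng(0)
m, tau, se = 5000, 0.1, 0.2
beta = rng.normal(0.0, tau, size=m)             # true effects
beta_hat = beta + rng.normal(0.0, se, size=m)   # noisy training estimates

beta_eb, factor = eb_shrink(beta_hat, se)

# Treating training estimates as if they were the truth inflates the signal.
naive_signal = float(np.sum(beta_hat ** 2))
true_signal = float(np.sum(beta ** 2))

# Shrunken estimates are closer to the true effects on average.
mse_raw = float(np.mean((beta_hat - beta) ** 2))
mse_eb = float(np.mean((beta_eb - beta) ** 2))

print(f"shrinkage factor: {factor:.3f}")
print(f"naive vs true signal: {naive_signal:.1f} vs {true_signal:.1f}")
print(f"MSE raw vs shrunk: {mse_raw:.4f} vs {mse_eb:.4f}")
```

In this toy setting the naive signal exceeds the true signal because the noise variance is counted as signal, which is the same kind of optimism that affects performance predicted directly from training effects. Real GWAS data involve linkage disequilibrium and sparse effects, which this sketch ignores.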
multidisciplinary sciences