Improved polygenic prediction by Bayesian multiple regression on summary statistics

Luke R. Lloyd-Jones, Jian Zeng, Julia Sidorenko, Loïc Yengo, Gerhard Moser, Kathryn E. Kemper, Huanwei Wang, Zhili Zheng, Reedik Mägi, Tõnu Esko, Andres Metspalu, Naomi R. Wray, Michael E. Goddard, Jian Yang, Peter M. Visscher
DOI: https://doi.org/10.1038/s41467-019-12653-0
IF: 16.6
2019-11-08
Nature Communications
Abstract: Accurate prediction of an individual’s phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to one that utilises summary statistics from genome-wide association studies (GWAS), SBayesR. In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R² by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding.
multidisciplinary sciences
What problem does this paper attempt to address?