XPXP: improving polygenic prediction by cross-population and cross-phenotype analysis

Jiashun Xiao, Mingxuan Cai, Xianghong Hu, Xiang Wan, Gang Chen, Can Yang
DOI: https://doi.org/10.1093/bioinformatics/btac029
IF: 5.8
2022-01-18
Bioinformatics
Abstract:
Motivation: With the increasing sample sizes of genome-wide association studies (GWASs), polygenic risk scores (PRSs) have shown great potential in personalized medicine for disease risk prediction, prevention and treatment. However, a PRS constructed using European samples becomes less accurate when it is applied to individuals from non-European populations. Improving the accuracy of PRSs in under-represented populations, such as African and East Asian populations, is an urgent task.
Results: In this article, we propose a cross-population and cross-phenotype (XPXP) method for constructing PRSs in under-represented populations. XPXP can construct accurate PRSs by leveraging biobank-scale datasets in European populations and multiple GWASs of genetically correlated phenotypes. XPXP also allows population-specific and phenotype-specific effects to be incorporated, which further improves the accuracy of the PRS. Through comprehensive simulation studies and real data analysis, we demonstrated that XPXP outperformed existing PRS approaches. We showed that the height PRSs constructed by XPXP achieved 9% and 18% improvement over the runner-up method in terms of predicted R² in East Asian and African populations, respectively. We also showed that XPXP substantially improved the stratification ability in identifying individuals at high genetic risk of type 2 diabetes.
Availability and implementation: The XPXP software and all analysis code are available at github.com/YangLabHKUST/XPXP.
Supplementary information: Supplementary data are available at Bioinformatics online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?