Abstract:

Background: Polygenic risk scores (PRS) have ushered in a new era in genetic epidemiology, offering insights into individual predispositions to a wide range of diseases. However, despite recent marked improvements in their predictive power, several challenges must be overcome before PRS-based models can be broadly applied in the clinic, including sufficient accuracy, easy interpretability, and portability across diverse populations.

Methods: Leveraging trans-ancestry genome-wide association study (GWAS) meta-analysis, we generated novel, diverse summary statistics for 30 medically related traits and used them to benchmark the performance of six existing PRS algorithms in the UK Biobank. Observing that SBayesRC had the best overall performance but recognizing strengths in each method, we developed an ensemble PRS model that uses logistic regression to combine outputs from the top-performing algorithms. This ensemble model was validated on the diverse eMERGE and PAGE MEC cohorts, and its performance was compared against current state-of-the-art PRS models. To enhance predictive accuracy for clinical application, we incorporated easily accessible clinical characteristics such as age, gender, ancestry, and risk factors, creating disease prediction models intended as prospective diagnostic tests with easily interpretable positive or negative outcomes.

Results: Predictive performance of PRS models improved with trans-ancestry GWAS meta-analysis and was further enhanced by the ensemble model, which surpassed state-of-the-art PRS models. When applied to external cohorts, performance drops were minimal, indicating good calibration. After adding clinical characteristics, 12 out of 30 models surpassed 80% AUC. Further, 25 traits exceeded a diagnostic odds ratio (DOR) of 5 and 19 traits exceeded a DOR of 10 for all ancestry groups, indicating high predictive value. The highest DOR in a population with a sufficient number of cases was 66.2 for Alzheimer's disease in Europeans. Our PRS model for coronary artery disease identified 55-80 times more true coronary events than rare pathogenic variant models, reinforcing its clinical potential. The polygenic component modulated the effect of high-risk rare variants, stressing the need to consider all genetic components in clinical settings.

Conclusions: Newly developed PRS-based disease prediction models have sufficient accuracy and portability to warrant consideration for clinical use.
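
For readers unfamiliar with the diagnostic odds ratio reported above, the following is a minimal sketch of how a DOR can be computed from a 2x2 table of test outcomes. It assumes the standard definition, DOR = (TP x TN) / (FP x FN). The function name `diagnostic_odds_ratio` and the counts are hypothetical and chosen only for illustration; they are not taken from the study, and the abstract does not state how the authors handled empty cells or thresholds.

```python
def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """Return the diagnostic odds ratio (TP * TN) / (FP * FN).

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives, and true negatives for a binary test.
    """
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if fp == 0 or fn == 0:
        # The ratio is undefined when either error cell is empty.
        # Continuity corrections exist, but none is assumed here.
        raise ValueError("DOR is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


# Hypothetical counts (not from the study): 100 cases, 1000 controls.
# Sensitivity = 80/100 = 0.80, specificity = 900/1000 = 0.90.
example_dor = diagnostic_odds_ratio(tp=80, fp=100, fn=20, tn=900)
print(f"DOR = {example_dor:.1f}")
```

Under this definition, a DOR of 1 means the test does not separate cases from controls, and larger values mean better discrimination, which is why thresholds such as 5 and 10 serve as benchmarks.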

XPXP: improving polygenic prediction by cross-population and cross-phenotype analysis

A Unified Framework for Cross-Population Trait Prediction by Leveraging the Genetic Correlation of Polygenic Traits

Quantifying Portable Genetic Effects and Improving Cross-Ancestry Genetic Prediction with GWAS Summary Statistics

Improving polygenic prediction in ancestrally diverse populations

PRS-GRID: A Cross and Within Ancestry Polygenic Risk Prediction Method Based on Individual Genetic Distance

JointPRS: A Data-Adaptive Framework for Multi-Population Genetic Risk Prediction Incorporating Genetic Correlation

Risk assessment for colorectal cancer via polygenic risk score and lifestyle exposure: a large-scale association study of East Asian and European populations

An Ensemble Penalized Regression Method for Multi-ancestry Polygenic Risk Prediction

MultiPopPred: A Trans-Ethnic Disease Risk Prediction Method, and its Application to the South Asian Population

Polygenic risk score portability for common diseases across genetically diverse populations

All of Us diversity and scale improve polygenic prediction contextually with greatest improvements for under-represented populations

Multiethnic polygenic risk prediction in diverse populations through transfer learning

The expected polygenic risk score (ePRS) framework: an equitable metric for quantifying polygenetic risk via modeling of ancestral makeup

Methodologies underpinning polygenic risk scores estimation: a comprehensive overview

Polygenic risk scores for prediction of breast cancer risk in Asian populations

Identifying and characterizing disease subpopulations that most benefit from polygenic risk scores

Principles and methods for transferring polygenic risk scores across global populations

Polygenic Prediction of Type 2 Diabetes in Africa

Applying polygenic risk score methods to pharmacogenomics GWAS: challenges and opportunities

Optimization of Multi-Ancestry Polygenic Risk Score Disease Prediction Models