Correlation-based tests for the formal comparison of polygenic scores in multiple populations

Sophia Gunn, Kathryn L. Lunetta
DOI: https://doi.org/10.1371/journal.pgen.1011249
IF: 4.5
2024-04-27
PLoS Genetics
Abstract: Polygenic scores (PGS) are measures of genetic risk, derived from the results of genome-wide association studies (GWAS). Previous work has proposed the coefficient of determination (R²) as an appropriate measure by which to compare PGS performance in a validation dataset. Here we propose correlation-based methods for evaluating PGS performance by adapting previous work which produced a statistical framework and robust test statistics for the comparison of multiple correlation measures in multiple populations. This flexible framework can be extended to a wider variety of hypothesis tests than currently available methods. We assess our proposed method in simulation and demonstrate its utility with two examples, assessing previously developed PGS for low-density lipoprotein cholesterol and height in multiple populations in the All of Us cohort. Finally, we provide an R package 'coranova' with both parametric and nonparametric implementations of the described methods.

Author summary: Polygenic scores (PGS) are measures of genetic risk of disease that have been widely embraced by the scientific community. While there are many methods available to develop PGS, we have limited tools by which to compare PGS performance. Previous work has proposed an R²-based approach which appropriately accounts for the correlation between PGS when comparing their performance. Here, we propose correlation-based tests which can assess multiple scores in multiple populations while accounting for the correlation between the scores. Our method is highly flexible and can be used by researchers to test any linear hypothesis of PGS performance, though we suggest three ANOVA-like tests as a starting point. We apply our method to PGS developed for LDL cholesterol and height in the All of Us cohort. In these examples, we demonstrate how our method can be used by researchers to compare and evaluate PGS in multiple populations. This approach will be particularly useful as we look to improve PGS performance in underrepresented populations in genetic research and need to evaluate PGS in multiple populations to appropriately assess PGS performance.
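To make the idea of comparing correlated scores concrete, the following is a minimal sketch of a nonparametric comparison of two PGS in a single population. It is not the 'coranova' package or its API, and it is not the authors' test statistic. The function name `bootstrap_corr_difference`, the simulated data, and the choice of a percentile bootstrap are all assumptions made for illustration. Resampling whole individuals keeps each person's phenotype and both scores together, so the dependence between the two scores carries through to the interval for the difference in correlations.

```python
import numpy as np


def bootstrap_corr_difference(y, pgs_a, pgs_b, n_boot=2000, seed=0):
    """Percentile-bootstrap interval for corr(y, pgs_a) - corr(y, pgs_b).

    Hypothetical helper for illustration only; not part of any package.
    """
    rng = np.random.default_rng(seed)
    y, pgs_a, pgs_b = (np.asarray(v, dtype=float) for v in (y, pgs_a, pgs_b))
    n = y.size

    def corr_diff(idx):
        # Difference in Pearson correlation with the phenotype on one sample.
        r_a = np.corrcoef(y[idx], pgs_a[idx])[0, 1]
        r_b = np.corrcoef(y[idx], pgs_b[idx])[0, 1]
        return r_a - r_b

    observed = corr_diff(np.arange(n))
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample individuals with replacement; each draw keeps the
        # (phenotype, score A, score B) triple of one person intact.
        diffs[b] = corr_diff(rng.integers(0, n, size=n))
    lower, upper = np.percentile(diffs, [2.5, 97.5])
    return observed, lower, upper


# Simulated example (assumed data): both scores track the same genetic
# signal g, but score B is noisier, so it correlates less with y.
rng = np.random.default_rng(1)
n = 2000
g = rng.normal(size=n)
y = g + rng.normal(size=n)
pgs_a = g + 0.5 * rng.normal(size=n)
pgs_b = g + 2.0 * rng.normal(size=n)

observed, lower, upper = bootstrap_corr_difference(y, pgs_a, pgs_b)
print(f"difference in correlation: {observed:.3f} (95% interval {lower:.3f} to {upper:.3f})")
```

An interval that excludes zero suggests the two scores differ in performance in this sample. Extending the same contrast across several scores and several populations is the kind of linear hypothesis the abstract describes, and the authors' package should be preferred for that purpose.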
genetics & heredity