Bayesian approach to assessing population differences in genetic risk of disease with application to prostate cancer

Iain R. Timmins, Frank Dudbridge, The PRACTICAL Consortium
DOI: https://doi.org/10.1371/journal.pgen.1011212
IF: 4.5
2024-04-18
PLoS Genetics
Abstract: Population differences in risk of disease are common, but the potential genetic basis for these differences is not well understood. A standard approach is to compare genetic risk across populations by testing for mean differences in polygenic scores, but existing studies that use this approach do not account for the statistical noise in effect estimates (i.e., the GWAS betas) that arises from the finite sample size of GWAS training data. Here, we show using Bayesian polygenic score methods that the level of uncertainty in estimates of genetic risk differences across populations is highly dependent on the GWAS training sample size, the polygenicity (number of causal variants), and the genetic distance (F_ST) between the populations considered. We derive a Wald test for formally assessing the difference in genetic risk across populations. We show that this test has calibrated type 1 error rates under the simplifying assumption that all SNPs are independent, which we achieve in practice using linkage disequilibrium (LD) pruning. We further provide closed-form expressions for assessing the uncertainty in estimates of relative genetic risk across populations under the special case of an infinitesimal genetic architecture. We suggest that for many complex traits and diseases, particularly those with more polygenic architectures, current GWAS sample sizes are insufficient to detect moderate differences in genetic risk across populations, though more substantial differences in relative genetic risk (relative risk > 1.5) can be detected. We show that conventional approaches that do not account for sampling error from the training sample, such as a simple t-test, have very high type 1 error rates. When applying our approach to prostate cancer, we demonstrate a higher genetic risk in African Ancestry men, with lower risk in men of European followed by East Asian ancestry.
Many diseases and complex traits, such as prostate cancer, exhibit differences in incidence across populations. Yet the potential contribution of genetic factors towards such disparities is unclear. Polygenic scores summarise genetic effects across the genome and can in principle provide a valuable tool for assessing and comparing disease risk across populations. In practice, current approaches based on polygenic scores assume that such scores measure genetic risk of disease without measurement error, and thus do not account for the uncertainty that arises when the score is constructed from a finite genome-wide association study (GWAS) training sample, which can be substantial. We introduce a Bayesian approach based on the LDpred2 polygenic score model that accounts fully for training sample uncertainty, and we propose a Wald test for formally testing such genetic risk differences across populations. Simulations show that the method properly controls type 1 error assuming independent SNPs (achieved by pruning), and that statistical power is sensitive to both the genetic architecture (heritability and polygenicity) and the training sample size. In application to prostate cancer, this framework enables us to identify a higher genetic risk in African Ancestry men, with lower risk in men of European followed by East Asian ancestry.
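The idea of propagating GWAS training uncertainty into a test of population risk differences can be illustrated with a minimal sketch. It is not the authors' implementation and does not use LDpred2. It assumes independent SNPs, an infinitesimal-style normal prior with a common variance `prior_var` on every per-allele effect, and GWAS estimates that are normal around the true effect with known standard errors. The function name `wald_risk_difference` and all argument names are hypothetical. Under these assumptions each effect has a conjugate normal posterior, the difference in mean genetic score between two populations is a weighted sum of effects with weights 2(p_A - p_B), and a Wald-type statistic is the posterior mean of that sum divided by its posterior standard deviation.

```python
import math


def wald_risk_difference(beta_hat, se, prior_var, freq_a, freq_b):
    """Wald-type test for a difference in mean genetic score between populations.

    Assumptions (illustrative only, not the paper's model):
      - SNPs are independent (e.g. after LD pruning);
      - each true effect has prior N(0, prior_var);
      - each GWAS estimate is N(true effect, se_j^2).
    """
    if not (len(beta_hat) == len(se) == len(freq_a) == len(freq_b)):
        raise ValueError("all inputs must have one entry per SNP")

    post_mean = 0.0  # posterior mean of the score difference
    post_var = 0.0   # posterior variance of the score difference
    for b, s, pa, pb in zip(beta_hat, se, freq_a, freq_b):
        # Normal-normal conjugacy: the posterior mean shrinks the estimate,
        # and the posterior variance is prior_var * s^2 / (prior_var + s^2).
        shrink = prior_var / (prior_var + s * s)
        # Difference in expected allele count between the two populations.
        d = 2.0 * (pa - pb)
        post_mean += d * shrink * b
        post_var += d * d * shrink * s * s

    if post_var <= 0.0:
        raise ValueError("posterior variance is zero; frequencies may be identical")

    z = post_mean / math.sqrt(post_var)
    # Two-sided p-value from the standard normal (chi-square with 1 df on z^2).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {"mean": post_mean, "var": post_var, "z": z, "p_value": p_value}
```

In this sketch, larger standard errors (a smaller training sample) both shrink the estimated difference towards zero and keep its posterior variance close to the prior variance, so the statistic moves towards zero. A test that treats the estimated effects as exact would omit `post_var` from the effect estimates entirely, which is the source of the inflated type 1 error that the abstract attributes to a simple t-test.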
genetics & heredity