Allele frequency differences of causal variants have a major impact on low cross-ancestry portability of PRS

Saitou, M., Dahl, A., Wang, Q., Liu, X.
DOI: https://doi.org/10.1101/2022.10.21.22281371
2022-10-26
medRxiv
Abstract: Genome-wide association studies (GWAS) are overwhelmingly biased toward European ancestries. Nearly all existing studies agree that transferring genetic predictions from European ancestries to other populations results in a substantial loss of accuracy. This is commonly referred to as low portability of polygenic risk scores (PRS) and is one of the most important barriers to the ethical clinical deployment of PRS. Yet, it remains unclear how much various genetic factors, such as linkage disequilibrium (LD) differences, allele frequency differences or causal effect differences, contribute to low PRS portability. In this study, we used gene expression levels in lymphoblastoid cell lines (LCLs) as a simplified model of complex traits with minimal environmental variation, in order to understand how much each genetic factor contributes to PRS portability from European to African populations. We found that cis-genetic effects on gene expression are highly similar between European and African individuals (ρ_g = 0.95, S.E. = 0.09). This stands in stark contrast to the very low estimates of cis-genetic correlation between Europeans and Africans in previous studies, which we demonstrate are artifacts of statistical bias. We showed that portability decreases with increasing LD differences in the cis-regions. We also found that allele frequency differences of causal variants have a striking impact on PRS portability. For example, PRS portability is reduced by more than 32% when the causal cis-variant is common (minor allele frequency, MAF > 5%) in European samples (training population) but is rarer (MAF < 5%) in African samples (prediction population). While large allele frequency differences can decrease PRS portability through increasing LD differences, we also show that causal allele frequency can significantly impact portability independently of LD. This observation suggests that improving statistical fine-mapping alone does not overcome the loss of portability caused by causal allele frequency differences. Lastly, we also found that causal allele frequency is the main genetic factor underlying differential gene expression levels across ancestries. We conclude that causal genetic effects are highly similar in Europeans and Africans, and low PRS portability is primarily due to allele frequency differences.
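To give some intuition for why causal allele frequency alone can affect portability even when the causal effect is identical across populations, here is a minimal toy sketch. It is not the authors' analysis pipeline. It assumes a single causal biallelic variant, a purely additive model, Hardy-Weinberg equilibrium, a perfectly known causal effect (so LD and fine-mapping play no role), and the same residual variance in both populations. The function names, the example allele frequencies, and the default parameter values are all hypothetical and chosen only for illustration.

```python
def genetic_variance(maf: float, beta: float) -> float:
    """Variance contributed by one biallelic variant with allele frequency
    `maf` and per-allele effect `beta`, assuming an additive model and
    Hardy-Weinberg equilibrium: 2 * p * (1 - p) * beta**2."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("maf must lie in [0, 0.5]")
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def expected_r2(maf: float, beta: float = 1.0, residual_var: float = 1.0) -> float:
    """Expected prediction R^2 of a score built from the true causal variant:
    genetic variance divided by total trait variance."""
    vg = genetic_variance(maf, beta)
    return vg / (vg + residual_var)


def relative_portability(
    maf_train: float,
    maf_target: float,
    beta: float = 1.0,
    residual_var: float = 1.0,
) -> float:
    """Ratio of expected R^2 in the target population to expected R^2 in the
    training population, holding the causal effect `beta` fixed."""
    r2_train = expected_r2(maf_train, beta, residual_var)
    if r2_train == 0.0:
        raise ValueError("variant is monomorphic in the training population")
    return expected_r2(maf_target, beta, residual_var) / r2_train


if __name__ == "__main__":
    # Hypothetical frequencies: common in training (30%), rare in target (3%).
    ratio = relative_portability(maf_train=0.30, maf_target=0.03)
    print(f"relative portability: {ratio:.3f}")
```

Under these simplifying assumptions, a variant that is common in the training population but rare in the prediction population explains far less trait variance in the latter, so accuracy drops even though the causal effect is unchanged. The empirical reduction reported in the abstract comes from real expression data and is not reproduced by this toy calculation.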
What problem does this paper attempt to address?