Developing and Validating Polygenic Risk Scores for Colorectal Cancer Risk Prediction in East Asians

Jie Ping, Yaohua Yang, Wanqing Wen, Sun-Seog Kweon, Koichi Matsuda, Wei-Hua Jia, Aesun Shin, Yu-Tang Gao, Keitaro Matsuo, Jeongseon Kim, Dong-Hyun Kim, Sun Ha Jee, Qiuyin Cai, Zhishan Chen, Ran Tao, Min-Ho Shin, Chizu Tanikawa, Zhi-Zhong Pan, Jae Hwan Oh, Isao Oze, Yoon-Ok Ahn, Keum Ji Jung, Zefang Ren, Xiao-Ou Shu, Jirong Long, Wei Zheng
DOI: https://doi.org/10.1002/ijc.34194
2022-01-01
MedRxiv
Abstract: Several polygenic risk scores (PRSs) have been developed to predict the risk of colorectal cancer (CRC) in European descendants. We used genome-wide association study (GWAS) data from 22 702 cases and 212 486 controls of Asian ancestry to develop PRSs and validated them in two case-control studies (1454 Korean and 1736 Chinese). Eleven PRSs were derived using three approaches: GWAS-identified CRC risk SNPs, CRC risk variants identified through fine-mapping of known risk loci, and genome-wide risk prediction algorithms. Logistic regression was used to estimate odds ratios (ORs) and area under the curve (AUC). PRS115-EAS, a PRS with 115 GWAS-reported risk variants derived from East-Asian data, validated significantly better than PRS115-EUR, derived from European descendants. In the Korea validation set, the OR per SD increase of PRS115-EAS was 1.63 (95% CI = 1.46-1.82; AUC = 0.63), compared with an OR of 1.44 (95% CI = 1.29-1.60; AUC = 0.60) for PRS115-EUR. PRS115-EAS/EUR, derived using meta-analysis results of both populations, slightly improved the AUC to 0.64. Similar but weaker associations were found in the China validation set. Individuals in the highest 5% of PRS115-EAS/EUR have a 2.52-fold elevated CRC risk compared with the medium-risk group (41st-60th percentile) and have a 12% to 20% risk of developing CRC by age 85. PRSs constructed using results from fine-mapping and genome-wide algorithms did not perform as well as PRS115-EAS and PRS115-EAS/EUR in risk prediction, possibly due to a small sample size. Our results indicate that CRC PRSs are promising in predicting CRC risk in East Asians and highlight the importance of using population-specific data to build CRC risk prediction models.
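The quantities reported in the abstract (a PRS built from risk variants, an OR per SD increase, and an AUC) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the genotypes, allele frequencies, and per-variant weights below are simulated, all function names are hypothetical, and scaling the score by the SD among controls is an assumption, since the abstract does not say how the PRS was standardized.

```python
import numpy as np

def polygenic_risk_score(dosages, log_odds_weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 per variant)."""
    return np.asarray(dosages, dtype=float) @ np.asarray(log_odds_weights, dtype=float)

def standardize_by_controls(prs, is_case):
    """Rescale so one unit equals one SD among controls (an assumed convention)."""
    controls = prs[~is_case]
    return (prs - controls.mean()) / controls.std(ddof=1)

def or_per_sd(prs_z, is_case, n_iter=25):
    """Univariable logistic regression fitted by Newton-Raphson; returns exp(slope)."""
    X = np.column_stack([np.ones_like(prs_z), prs_z])
    y = is_case.astype(float)
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hessian, gradient)
    return float(np.exp(beta[1]))

def auc(scores, is_case):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases, controls = scores[is_case], scores[~is_case]
    diff = cases[:, None] - controls[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

# Simulated data only: 2000 people, 115 variants with made-up frequencies and weights.
rng = np.random.default_rng(0)
n_people, n_variants = 2000, 115
risk_allele_freq = rng.uniform(0.1, 0.5, size=n_variants)
weights = rng.uniform(0.02, 0.12, size=n_variants)  # hypothetical per-allele log ORs
dosages = rng.binomial(2, risk_allele_freq, size=(n_people, n_variants))

prs = polygenic_risk_score(dosages, weights)
# Case status is drawn from a logistic model in which the PRS is the only risk factor.
is_case = rng.random(n_people) < 1.0 / (1.0 + np.exp(-(prs - prs.mean())))

prs_z = standardize_by_controls(prs, is_case)
odds_ratio = or_per_sd(prs_z, is_case)
auc_value = auc(prs, is_case)
print(f"OR per SD: {odds_ratio:.2f}, AUC: {auc_value:.2f}")
```

A real analysis would typically adjust the regression for covariates such as age, sex, and ancestry principal components, and would report confidence intervals; the sketch omits both to keep the three definitions visible.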