Identifying the predictive effectiveness of a genetic risk score for incident hypertension using machine learning methods among populations in rural China
Miaomiao Niu, Yikang Wang, Liying Zhang, Runqi Tu, Xiaotian Liu, Jian Hou, Wenqian Huo, Zhenxing Mao, Chongjian Wang, Ronghai Bie
DOI: https://doi.org/10.1038/s41440-021-00738-7
2021-09-03
Hypertension Research
Abstract: Previous studies have reported conflicting results on the value of genetic risk scores (GRSs) for hypertension prediction. Machine learning methods are used extensively in the medical field but rarely in the mining of genetic information. This study aimed to determine whether genetic information can improve the prediction of incident hypertension when machine learning approaches are used in a prospective study. The study recruited 4592 subjects without hypertension at baseline from a cohort study conducted in rural China. A polygenic risk score (PGGRS) was calculated using 13 SNPs. Subjects were randomly allocated to the training and test datasets at a ratio of 7:3. Models with and without the PGGRS were built on the training dataset using Cox regression, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) methods. The discrimination and reclassification of the models were evaluated on the test dataset. The PGGRS was significantly associated with the risk of incident hypertension (HR (95% CI), 1.046 (1.004, 1.090), P = 0.031) irrespective of baseline blood pressure. Models that did not include the PGGRS achieved AUCs (95% CI) of 0.785 (0.763, 0.807), 0.790 (0.768, 0.811), 0.838 (0.817, 0.857), and 0.854 (0.835, 0.873) for the Cox, ANN, RF, and GBM methods, respectively. Adding the PGGRS improved the AUC by 0.001, 0.008, 0.023, and 0.017; the integrated discrimination improvement (IDI) by 1.39%, 2.86%, 4.73%, and 4.68%; and the net reclassification improvement (NRI) by 25.05%, 13.01%, 44.87%, and 22.94% for the Cox, ANN, RF, and GBM methods, respectively. Incident hypertension risk was better predicted by the traditional+PGGRS model, especially when machine learning approaches were used, suggesting that genetic information, combined with machine learning methods, may have the potential to identify new hypertension cases in resource-limited areas.
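The IDI and NRI figures reported in the abstract compare predicted risks from a model without the PGGRS against a model with it. The following is a minimal sketch of how such metrics are commonly computed, not the authors' code. It assumes a category-free (continuous) NRI, which the abstract does not specify, and the function names `idi` and `continuous_nri` and the toy risks are hypothetical.

```python
from statistics import mean


def idi(y, p_old, p_new):
    """Integrated discrimination improvement.

    Change in the gap between the mean predicted risk of events (y == 1)
    and non-events (y == 0) when moving from the old to the new model.
    """
    def slope(p):
        events = mean(pi for yi, pi in zip(y, p) if yi == 1)
        non_events = mean(pi for yi, pi in zip(y, p) if yi == 0)
        return events - non_events

    return slope(p_new) - slope(p_old)


def continuous_nri(y, p_old, p_new):
    """Category-free NRI (an assumption; the abstract does not say which NRI).

    Net proportion of events whose predicted risk rose, minus the net
    proportion of non-events whose predicted risk rose.
    """
    def net_up(label):
        diffs = [pn - po for yi, po, pn in zip(y, p_old, p_new) if yi == label]
        up = sum(d > 0 for d in diffs)
        down = sum(d < 0 for d in diffs)
        return (up - down) / len(diffs)

    return net_up(1) - net_up(0)


# Toy data, invented for illustration only (not from the study).
y = [1, 1, 1, 0, 0, 0]                    # 1 = developed hypertension
p_old = [0.6, 0.5, 0.4, 0.4, 0.3, 0.2]    # risks from a traditional model
p_new = [0.7, 0.6, 0.3, 0.3, 0.2, 0.3]    # risks from a traditional+PGGRS model

print(f"IDI = {idi(y, p_old, p_new):.4f}")
print(f"NRI = {continuous_nri(y, p_old, p_new):.4f}")
```

In practice the two risk vectors would come from the fitted models evaluated on the held-out 30% test split, and confidence intervals would typically be obtained by bootstrapping.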
peripheral vascular disease