Deep Learning-Based Polygenic Risk Analysis for Alzheimer’s Disease Prediction

Xiaopu Zhou, Yu Chen, Fanny C. F. Ip, Yuanbing Jiang, Han Cao, Ge Lv, Huan Zhong, Jiahang Chen, Tao Ye, Yuewen Chen, Yulin Zhang, Shuangshuang Ma, Ronnie M. N. Lo, Estella P. S. Tong, Michael W. Weiner, Vincent C. T. Mok, Timothy C. Y. Kwok, Qihao Guo, Kin Y. Mok, Maryam Shoai, John Hardy, Lei Chen, Amy K. Y. Fu, Nancy Y. Ip
DOI: https://doi.org/10.1038/s43856-023-00269-x
2023-01-01
Communications Medicine
Abstract:

Background: The polygenic nature of Alzheimer’s disease (AD) suggests that multiple variants jointly contribute to disease susceptibility. As an individual’s genetic variants are constant throughout life, evaluating the combined effects of multiple disease-associated genetic risks enables reliable AD risk prediction. Because of the complexity of genomic data, current statistical analyses cannot comprehensively capture the polygenic risk of AD, resulting in unsatisfactory disease risk prediction. However, deep learning methods, which capture nonlinearity within high-dimensional genomic data, may enable more accurate disease risk prediction and improve our understanding of AD etiology. Accordingly, we developed deep learning neural network models for modeling AD polygenic risk.

Methods: We constructed neural network models to model AD polygenic risk and compared them with the widely used weighted polygenic risk score and lasso models. We conducted robust linear regression analysis to investigate the relationship between the AD polygenic risk derived from deep learning methods and AD endophenotypes (i.e., plasma biomarkers and individual cognitive performance). We stratified individuals by applying unsupervised clustering to the outputs from the hidden layers of the neural network model.

Results: The deep learning models outperform other statistical models for modeling AD risk. Moreover, the polygenic risk derived from the deep learning models enables the identification of disease-associated biological pathways and the stratification of individuals according to distinct pathological mechanisms.

Conclusion: Our results suggest that deep learning methods are effective for modeling the genetic risks of AD and other diseases, classifying disease risks, and uncovering disease mechanisms.
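
For readers unfamiliar with the baseline named in the Methods, a weighted polygenic risk score is conventionally computed as a weighted sum of an individual's risk-allele counts, with per-variant effect sizes as the weights. The following is a minimal sketch of that general definition, not the authors' pipeline; the function name `weighted_prs`, the variable names, and the example numbers are all hypothetical and chosen only for illustration.

```python
def weighted_prs(dosages, weights):
    """Weighted polygenic risk score for one individual.

    dosages: risk-allele counts per variant (typically 0, 1, or 2)
    weights: per-variant effect sizes (e.g., log odds ratios from a GWAS)
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    # Linear, additive combination: each variant contributes independently.
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical example: three variants with made-up effect sizes.
example_weights = [0.5, -0.25, 1.0]
example_dosages = [2, 1, 0]
score = weighted_prs(example_dosages, example_weights)
```

Because the score is a purely additive combination of variants, it cannot represent interactions between them; this is the limitation that the nonlinear neural network models described in the abstract are meant to address.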