Assessing polyomic risk to predict Alzheimer's disease using a machine learning model

Tiffany Ngai, Julian Willett, Mohammad Waqas, Lucas H. Fishbein, Younjung Choi, Georg Hahn, Kristina Mullin, Christoph Lange, Julian Hecker, Rudolph E. Tanzi, Dmitry Prokopenko
DOI: https://doi.org/10.1002/alz.14319
2024-11-09
Alzheimer's & Dementia
Abstract:
INTRODUCTION Alzheimer's disease (AD) is the most common form of dementia in the elderly. Given that AD neuropathology begins decades before symptoms, there is a dire need for effective screening tools for early detection of AD to facilitate early intervention.
METHODS Here, we used tree‐based and deep learning methods to train polyomic prediction models for AD affection status and age at onset (AAO), employing genomic, proteomic, metabolomic, and drug use data from UK Biobank. We used SHAP to determine feature importance.
RESULTS Our best‐performing polyomic model achieved an area under the receiver operating characteristic curve (AUROC) of 0.87. We identified the proteins glial fibrillary acidic protein (GFAP) and CXCL17 as the strongest predictors of AD besides apolipoprotein E (APOE) alleles. Increasing the number of cases by including "AD‐by‐proxy" cases did not improve AD prediction.
DISCUSSION Among the four modalities, genomics and proteomics were the most informative based on AUROC. Our data suggest that two blood‐based biomarkers (GFAP and CXCL17) may be effective for early presymptomatic prediction of AD.
Highlights We developed a polyomic model to predict AD and AAO using omics and medication use data from electronic health records (EHR). We identified GFAP and CXCL17 proteins as the strongest predictors of AD besides APOE alleles. "AD‐by‐proxy" cases, if used in training, do not improve AD prediction. Proteomics was the most informative modality overall for affection status and AAO prediction.
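As a reading aid for the AUROC figure quoted in the abstract: AUROC can be interpreted as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code. The function name `auroc` and the toy labels and scores are invented for illustration and are not study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (case, control) pairs in which the case scores higher.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Toy example (hypothetical values): 3 cases, 4 controls.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

In the toy data, 11 of the 12 case-control pairs are ordered correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so on large cohorts a rank-based implementation would be the practical choice.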
clinical neurology