Predicting early Alzheimer's with blood biomarkers and clinical features

Muaath Ebrahim AlMansoori, Sherlyn Jemimah, Ferial Abuhantash, Aamna AlShehhi
DOI: https://doi.org/10.1038/s41598-024-56489-1
IF: 4.6
2024-03-14
Scientific Reports
Abstract: Alzheimer's disease (AD) is an incurable neurodegenerative disorder that leads to dementia. This study employs explainable machine learning models to detect dementia cases using blood gene expression, single nucleotide polymorphisms (SNPs), and clinical data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Analyzing 623 ADNI participants, we found that the Support Vector Machine classifier with Mutual Information (MI) feature selection, trained on all three data modalities, achieved exceptional performance (accuracy = 0.95, AUC = 0.94). When using gene expression and SNP data separately, we achieved very good performance (AUC = 0.65, AUC = 0.63, respectively). Using SHapley Additive exPlanations (SHAP), we identified significant features, potentially serving as AD biomarkers. Notably, genetic-based biomarkers linked to axon myelination and synaptic vesicle membrane formation could aid early AD detection. In summary, this genetic-based biomarker approach, integrating machine learning and SHAP, shows promise for precise AD diagnosis, biomarker discovery, and offers novel insights for understanding and treating the disease. This approach addresses the challenges of accurate AD diagnosis, which is crucial given the complexities associated with the disease and the need for non-invasive diagnostic methods.
multidisciplinary sciences
What problem does this paper attempt to address?
The paper attempts to address the problem of predicting the early stages of Alzheimer's disease (AD) using blood biomarkers and clinical features. Specifically, the researchers aim to detect Alzheimer's disease cases by combining gene expression data, single nucleotide polymorphism (SNP) data, and clinical data with machine learning models. The main objectives of the study include:

1. **Improving diagnostic accuracy**: By integrating multiple data modalities (gene expression, SNP, and clinical data), the goal is to enhance the prediction accuracy for Alzheimer's disease.
2. **Discovering new biomarkers**: Identifying genes and clinical features that could serve as biomarkers for Alzheimer's disease.
3. **Enhancing model interpretability**: Using SHapley Additive exPlanations (SHAP) to improve the interpretability of the model, helping clinicians better understand the decision-making process of the model.

The study results show that when all three data modalities (gene expression, SNP, and clinical data) are combined, the Support Vector Machine (SVM) classifier with the Mutual Information (MI) feature selection method achieved the best performance (accuracy of 0.95, AUC of 0.94). This indicates that the method has high potential for early Alzheimer's disease diagnosis.
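
To make the Mutual Information feature selection step mentioned above more concrete, the following is a minimal sketch of MI-based feature ranking. It is not the authors' implementation. The data are synthetic, the feature names (`informative`, `noise_a`, `noise_b`) and helper functions are hypothetical, and equal-width binning of continuous values is an assumption made here for simplicity. The study's actual MI estimator, classifier settings, and number of selected features are not described in this summary.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def discretize(values, n_bins=4):
    """Equal-width binning (an assumption of this sketch, not of the paper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def top_k_features(columns, labels, k):
    """Rank features by MI with the labels and return the top k names."""
    scores = {
        name: mutual_information(discretize(col), labels)
        for name, col in columns.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Synthetic stand-in data: one feature that tracks the label, two that do not.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(400)]
columns = {
    "informative": [2.0 * y + rng.gauss(0, 0.5) for y in labels],
    "noise_a": [rng.gauss(0, 1) for _ in labels],
    "noise_b": [rng.gauss(0, 1) for _ in labels],
}

selected, scores = top_k_features(columns, labels, k=1)
print("selected:", selected)
for name, score in scores.items():
    print(f"{name}: MI = {score:.3f} nats")
```

In a full pipeline, the selected features would then be passed to a classifier such as an SVM, with selection fitted only on the training folds to avoid leaking label information into the evaluation.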