Classifying Alzheimer's disease and normal subjects using machine learning techniques and genetic-environmental features

Yu-Hua Huang, Yi-Chun Chen, Wei-Min Ho, Ren-Guey Lee, Ren-Hua Chung, Yu-Li Liu, Pi-Yueh Chang, Shih-Cheng Chang, Chaung-Wei Wang, Wen-Hung Chung, Shih-Jen Tsai, Po-Hsiu Kuo, Yun-Shien Lee, Chun-Chieh Hsiao
DOI: https://doi.org/10.1016/j.jfma.2023.10.021
Abstract:
Background: Alzheimer's disease (AD) is a complex disorder shaped by multiple environmental and polygenic factors. The accuracy of artificial neural networks (ANNs) that incorporate these common factors to identify AD has not been evaluated.
Methods: A total of 184 patients with probable AD and 3773 healthy individuals aged 65 and over were enrolled. Variants in AD-related genes (51 SNPs) and 8 environmental factors were selected as features for multilayer ANN modeling. Random Forest (RF) and Support Vector Machine with RBF kernel (SVM) were also employed for comparison. Model results were verified using traditional statistics.
Results: The ANN achieved high accuracy (0.98), sensitivity (0.95), and specificity (0.96) in the intrinsic test for AD classification. Excluding age and genetic data still yielded favorable results (accuracy: 0.97, sensitivity: 0.94, specificity: 0.96). The weights assigned to the ANN features highlighted the importance of mental evaluation, years of education, and specific genetic variants (CASS4 rs7274581, PICALM rs3851179, and TOMM40 rs2075650) for AD classification. Receiver operating characteristic analysis revealed AUC values of 0.99 (intrinsic test), 0.60 (TWB-GWA), and 0.72 (CG-WGS), with AUC values of 0.96, 0.80, and 0.52 when age was excluded from the ANN. The performance of the ANN model in AD classification was comparable to that of RF, SVM (linear kernel), and SVM (RBF kernel).
Conclusion: The ANN model demonstrated good sensitivity, specificity, and accuracy in AD classification. The top-weighted SNPs for AD prediction were CASS4 rs7274581, PICALM rs3851179, and TOMM40 rs2075650. The ANN model performed similarly to RF and SVM, indicating that it can handle the complexity of AD as a disease entity.
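The abstract reports accuracy, sensitivity, and specificity side by side, which matters because the cohort is heavily imbalanced (184 AD cases versus 3773 controls). The following is a minimal sketch of how these three metrics are conventionally computed from binary labels. It is not the authors' code: the function name `classification_metrics` and the all-controls baseline are illustrative assumptions, and only the two cohort sizes come from the abstract.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = AD, 0 = control).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    nan = float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: fraction of true AD cases that were flagged as AD.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: fraction of true controls that were flagged as controls.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Cohort sizes from the abstract; the predictions are a made-up baseline
# that labels every subject as a control.
y_true = [1] * 184 + [0] * 3773
y_pred = [0] * len(y_true)

baseline = classification_metrics(y_true, y_pred)
print(baseline)
```

With this class ratio, the trivial all-controls baseline already reaches an accuracy of about 0.95 while its sensitivity is zero, which is why sensitivity and specificity are informative alongside accuracy here.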