Comparative analysis of machine learning algorithms for Alzheimer's disease classification using EEG signals and genetic information

Wei-Yang Yu, Ting-Hsuan Sun, Kai-Cheng Hsu, Chia-Chun Wang, Shang-Yu Chien, Chon-Haw Tsai, Yu-Wan Yang
DOI: https://doi.org/10.1016/j.compbiomed.2024.108621
Abstract: Alzheimer's disease (AD) is a progressive neurodegenerative disorder characterized by cognitive decline, memory impairments, and behavioral changes. The presence of abnormal beta-amyloid plaques and tau protein tangles in the brain is known to be associated with AD. However, current limitations of imaging technology hinder the direct detection of these substances. Consequently, researchers are exploring alternative approaches, such as indirect assessments involving monitoring brain signals, cognitive decline levels, and blood biomarkers. Recent studies have highlighted the potential of integrating genetic information into these approaches to enhance early detection and diagnosis, offering a more comprehensive understanding of AD pathology beyond the constraints of existing imaging methods. Our study utilized electroencephalography (EEG) signals, genotypes, and polygenic risk scores (PRSs) as features for machine learning models. We compared the performance of gradient boosting (XGB), random forest (RF), and support vector machine (SVM) to determine the optimal model. Statistical analysis revealed significant correlations between EEG signals and clinical manifestations, demonstrating the ability to distinguish the complexity of AD from other diseases by using genetic information. By integrating EEG with genetic data in an SVM model, we achieved exceptional classification performance, with an accuracy of 0.920 and an area under the curve of 0.916. This study presents a novel approach of utilizing real-time EEG data and genetic background information for multimodal machine learning. The experimental results validate the effectiveness of this concept, providing deeper insights into the actual condition of patients with AD and overcoming the limitations associated with single-oriented data.
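The two metrics reported in the abstract, accuracy and area under the ROC curve (AUC), can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study. AUC is computed here through its pairwise interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Each (positive, negative) pair counts 1 if the positive case has the
    higher score, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = AD, 0 = non-AD); NOT results from the paper.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))   # 0.9375
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two values can differ for the same model.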