Efficient Explainable Models for Alzheimer's Disease Classification with Feature Selection and Data Balancing Approach Using Ensemble Learning

Yogita Dubey, Aditya Bhongade, Prachi Palsodkar, Punit Fulzele
DOI: https://doi.org/10.3390/diagnostics14242770
IF: 3.6
2024-12-11
Diagnostics
Abstract:
Background: Alzheimer's disease (AD) is a progressive neurodegenerative disorder and the most common cause of dementia. Early diagnosis is critical for better management and treatment outcomes, but it remains challenging because of the complex nature of the disease. Clinical data, including a range of cognitive, functional, and demographic variables, play a crucial role in Alzheimer's disease classification. However, data imbalance and high-dimensional feature sets often hinder model performance.
Objective: This paper proposes a computationally efficient, reliable, and transparent machine learning-based framework for classifying Alzheimer's disease patients. The framework is interpretable and helps medical practitioners understand complex patterns in patient data.
Method: This study addresses these issues by employing boosting algorithms to improve classification accuracy. To mitigate data imbalance, a random sampling technique is applied, ensuring a balanced representation of Alzheimer's and healthy cases. Extensive feature analysis was conducted to identify the most impactful clinical features, followed by feature reduction to retain the most informative ones, which reduces model complexity and the risk of overfitting. Explainable AI tools, such as SHAP, LIME, ALE, and ELI5, are integrated to provide transparency into the model's decision-making process, highlighting the key features that influence each classification so that clinicians can understand and trust the predictions.
Results: This approach results in a robust, interpretable, and clinically relevant framework for Alzheimer's disease diagnosis. The proposed approach achieved a best accuracy of 95%, demonstrating its effectiveness and potential for reliable early diagnosis of Alzheimer's disease.
Conclusions: This study demonstrates that integrating ensemble learning algorithms and explainable AI, while using a balanced dataset with feature selection, improves quantitative results and interpretability. This approach offers a promising method for early and better-informed clinical decisions.
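The pipeline outlined in the abstract (balance the classes, reduce the features, train a boosting model, then inspect which features drive the predictions) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Everything specific in it is an assumption: the data are synthetic, the "random sampling technique" is taken to be random oversampling of the minority class, feature reduction is an ANOVA F-test keeping 10 features, the booster is scikit-learn's GradientBoostingClassifier, and permutation importance stands in for the SHAP, LIME, ALE, and ELI5 tools named in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.inspection import permutation_importance
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic, imbalanced stand-in for a clinical table (NOT the paper's dataset).
X, y = make_classification(n_samples=1000, n_features=30, n_informative=8,
                           weights=[0.65, 0.35], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)


def random_oversample(X, y, random_state=0):
    """Duplicate randomly chosen minority rows until all classes match the largest."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls, count in zip(classes, counts):
        X_cls = X[y == cls]
        if count < target:
            extra = resample(X_cls, replace=True, n_samples=target - count,
                             random_state=random_state)
            X_cls = np.vstack([X_cls, extra])
        X_parts.append(X_cls)
        y_parts.append(np.full(len(X_cls), cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


# Balance only the training split so the test set keeps its original distribution.
X_bal, y_bal = random_oversample(X_train, y_train)

# Keep the 10 features with the highest ANOVA F-score (k=10 is an arbitrary choice).
selector = SelectKBest(score_func=f_classif, k=10).fit(X_bal, y_bal)
X_bal_sel = selector.transform(X_bal)
X_test_sel = selector.transform(X_test)

# Train a boosting classifier on the balanced, reduced feature set.
model = GradientBoostingClassifier(random_state=0).fit(X_bal_sel, y_bal)
accuracy = accuracy_score(y_test, model.predict(X_test_sel))
print(f"test accuracy: {accuracy:.3f}")

# Model-agnostic importance: drop in accuracy when one feature is shuffled.
importance = permutation_importance(model, X_test_sel, y_test,
                                    n_repeats=5, random_state=0)
ranking = np.argsort(importance.importances_mean)[::-1]
print("selected features ranked by importance:", ranking)
```

Balancing is applied after the train/test split so that duplicated rows cannot leak into the evaluation set. The accuracy printed for this synthetic data has no relation to the 95% reported in the abstract.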
medicine, general & internal