Causal Forest Machine Learning Analysis of Parkinson's Disease in Resting-State Functional Magnetic Resonance Imaging

Gabriel Solana-Lavalle, Michael D Cusimano, Thomas Steeves, Roberto Rosas-Romero, Pascal N Tyrrell
DOI: https://doi.org/10.3390/tomography10060068
2024-06-06
Tomography
Abstract: In recent years, Artificial Intelligence has been used to assist healthcare professionals in detecting and diagnosing neurodegenerative diseases. In this study, we propose a methodology to analyze functional Magnetic Resonance Imaging signals and to classify Parkinson's disease patients and healthy participants using Machine Learning algorithms. In addition, the proposed approach provides insights into the brain regions affected by the disease. Functional Magnetic Resonance Imaging data from the PPMI and 1000-FCP datasets were pre-processed to extract time series from 200 brain regions per participant, resulting in 11,600 features. Causal Forest and Wrapper Feature Subset Selection algorithms were used for dimensionality reduction, resulting in a subset of features based on their heterogeneity and association with the disease. We utilized Logistic Regression and XGBoost algorithms to perform PD detection, achieving 97.6% accuracy, 97.5% F1 score, 97.9% precision, and 97.7% recall by analyzing sets with fewer than 300 features in a population including men and women. Finally, Multiple Correspondence Analysis was employed to visualize the relationships between brain regions and each group (women with Parkinson's disease, female controls, men with Parkinson's disease, male controls). Associations between the Unified Parkinson's Disease Rating Scale questionnaire results and affected brain regions in different groups were also obtained to show another use case of the methodology. This work proposes a methodology to (1) classify patients and controls with Machine Learning and the Causal Forest algorithm and (2) visualize associations between brain regions and groups, providing high-accuracy classification and enhanced interpretability of the correlation between specific brain regions and the disease across different groups.
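To make the wrapper feature subset selection step and the four reported metrics concrete, the following is a minimal sketch in standard-library Python. It is not the authors' pipeline: the data are synthetic, a simple nearest-centroid classifier stands in for Logistic Regression and XGBoost, the Causal Forest step is omitted entirely, and every name (make_data, centroid_predict, metrics, forward_select) is hypothetical. It only illustrates the general idea of greedily adding the feature that most improves validation accuracy, then scoring the chosen subset on held-out data.

```python
import random


def make_data(n=120, n_features=12, informative=(0, 3), shift=3.0, seed=0):
    """Synthetic data: label 1 = patient, 0 = control (all values are made up)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate classes so every split contains both
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += shift * label  # only these columns carry class signal
        X.append(row)
        y.append(label)
    return X, y


def centroid_predict(X_train, y_train, X_test, cols):
    """Nearest-centroid classifier restricted to the columns in `cols`."""
    centroids = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X_train, y_train) if t == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    preds = []
    for x in X_test:
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        preds.append(0 if dist[0] <= dist[1] else 1)
    return preds


def metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from the binary confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def forward_select(X_tr, y_tr, X_val, y_val, max_features):
    """Greedy wrapper: add the feature that most improves validation accuracy."""
    selected, best = [], 0.0
    remaining = list(range(len(X_tr[0])))
    while remaining and len(selected) < max_features:
        scored = []
        for j in remaining:
            preds = centroid_predict(X_tr, y_tr, X_val, selected + [j])
            scored.append((metrics(y_val, preds)["accuracy"], j))
        # Highest accuracy wins; ties go to the lowest column index.
        score, j = max(scored, key=lambda s: (s[0], -s[1]))
        if score <= best:
            break  # no candidate improves the current subset
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


X, y = make_data()
X_tr, y_tr = X[:60], y[:60]        # fit the classifier
X_val, y_val = X[60:90], y[60:90]  # drive the feature search
X_te, y_te = X[90:], y[90:]        # held out for the final report
selected, best = forward_select(X_tr, y_tr, X_val, y_val, max_features=4)
report = metrics(y_te, centroid_predict(X_tr, y_tr, X_te, selected))
print("selected columns:", selected)
print("held-out metrics:", report)
```

Keeping a separate held-out split matters in a wrapper approach, because the validation accuracy that guides the search is optimistically biased toward the selected subset.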