MRI-Based Multi-Class Relevance Vector Machine Classification of Neurodegenerative Diseases
Kyan Younes, Yann Cobigo, Amy Wolf, John Kornak, Katherine P Rankin, Mirza Faisal Beg, Lei Wang, Howard J Rosen
DOI: https://doi.org/10.1101/2024.10.07.24315054
2024-10-08
Abstract: Machine learning algorithms are promising automated tools that can help meet the growing need for dementia experts. Despite
the substantial development in MRI-based machine learning analyses, case misclassification is a universal finding, yet the
reasons behind misclassification are poorly understood. We implemented a multi-class classification approach that uses relevance
vector machine and logistic classification to classify research participants based on their whole-brain T1-weighted MRI scans. A
total of 468 participants from seven diagnostic classes were included: 144 healthy controls, 84 Alzheimer's disease (AD), 108 behavioral
variant frontotemporal dementia (bvFTD), 30 semantic variant primary progressive aphasia (svPPA), 30 non-fluent variant
primary progressive aphasia (nfvPPA), 30 corticobasal syndrome (CBS), and 42 progressive supranuclear palsy syndrome (PSPS).
We compared the algorithm's diagnostic accuracy against clinical, pathological, genetic, and quantitative imaging data. The
exact neurodegenerative syndrome was predicted in 71% of the cases, the neurodegenerative disease spectrum was predicted in
80% of the cases, and the algorithm distinguished controls from any dementia in 85% of the cases. The algorithm showed high
performance in diagnosing healthy controls, moderate performance in diagnosing AD, bvFTD, and svPPA, and low performance
in diagnosing CBS, nfvPPA, and PSPS. Based on the quantitative imaging data, most of the misclassified neurodegenerative cases
had minimal atrophy and brain volumes comparable to healthy controls. In AD, early-onset AD cases with minimal brain atrophy
represented most of the misclassified cases. In bvFTD, FTD genetic mutation carriers (predominantly C9orf72 repeat expansion),
FTD phenocopy cases, and patients meeting only possible bvFTD criteria represented most misclassified cases. Case misclassification in
machine learning studies of neurodegenerative diseases results from neurodegenerative disease heterogeneity and the limited ability of
structural MRI to capture the full range of biological changes. Larger and more inclusive datasets that are representative
of population biological heterogeneity are needed to train better machine learning techniques, and a margin of error is expected and
should be accepted, much like the uncertainty in a clinical diagnosis made by a dementia expert.
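
The three accuracy levels reported above (exact syndrome, disease spectrum, control versus any dementia) can be read as the same set of predictions scored at progressively coarser label groupings. The Python sketch below illustrates that idea only. The class abbreviation "HC", the SPECTRUM grouping, and the toy label lists are assumptions made for illustration; they are not the study's data, and the grouping the authors actually used may differ.

```python
from typing import Dict, List, Optional

# Hypothetical grouping of the seven diagnostic classes into coarser spectra.
# This mapping is an assumption for illustration, not taken from the paper.
SPECTRUM: Dict[str, str] = {
    "HC": "control",
    "AD": "AD",
    "bvFTD": "FTD-spectrum",
    "svPPA": "FTD-spectrum",
    "nfvPPA": "FTD-spectrum",
    "CBS": "FTD-spectrum",
    "PSPS": "FTD-spectrum",
}

# Coarsest grouping: healthy control versus any dementia syndrome.
BINARY: Dict[str, str] = {
    label: ("control" if label == "HC" else "dementia") for label in SPECTRUM
}


def accuracy(
    true_labels: List[str],
    pred_labels: List[str],
    grouping: Optional[Dict[str, str]] = None,
) -> float:
    """Fraction of cases whose predicted group matches the true group.

    With grouping=None, labels are compared as-is (exact syndrome match).
    """
    if not true_labels or len(true_labels) != len(pred_labels):
        raise ValueError("label lists must be non-empty and of equal length")

    def group(label: str) -> str:
        return grouping[label] if grouping is not None else label

    hits = sum(group(t) == group(p) for t, p in zip(true_labels, pred_labels))
    return hits / len(true_labels)


# Toy labels, invented for illustration only.
TRUE = ["HC", "AD", "bvFTD", "svPPA", "CBS", "PSPS"]
PRED = ["HC", "AD", "svPPA", "svPPA", "HC", "AD"]

exact = accuracy(TRUE, PRED)               # exact syndrome
spectrum = accuracy(TRUE, PRED, SPECTRUM)  # disease spectrum
binary = accuracy(TRUE, PRED, BINARY)      # control vs any dementia

print(f"exact={exact:.2f} spectrum={spectrum:.2f} binary={binary:.2f}")
```

Because each coarser grouping merges classes, a prediction that is correct at the syndrome level is also correct at the spectrum and binary levels, so the three accuracies can only stay the same or increase, matching the ordering of the reported 71%, 80%, and 85%.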