Evaluation of machine learning-based classification of clinical impairment and prediction of clinical worsening in multiple sclerosis

Samantha Noteboom, Moritz Seiler, Claudia Chien, Roshan P. Rane, Frederik Barkhof, Eva M. M. Strijbis, Friedemann Paul, Menno M. Schoonheim, Kerstin Ritter
DOI: https://doi.org/10.1007/s00415-024-12507-w
2024-06-24
Journal of Neurology
Abstract: Background: Robust predictive models of clinical impairment and worsening in multiple sclerosis (MS) are needed to identify patients at risk and optimize treatment strategies. Objective: To evaluate whether machine learning (ML) methods can classify clinical impairment and predict worsening in people with MS (pwMS) and, if so, which combination of clinical and magnetic resonance imaging (MRI) features and ML algorithm is optimal. Methods: We used baseline clinical and structural MRI data from two MS cohorts (Berlin: n = 125, Amsterdam: n = 330) to evaluate the capability of five ML models in classifying clinical impairment at baseline and predicting future clinical worsening over a follow-up of 2 and 5 years. Clinical worsening was defined by increases in the Expanded Disability Status Scale (EDSS), Timed 25-Foot Walk Test (T25FW), 9-Hole Peg Test (9HPT), or Symbol Digit Modalities Test (SDMT). Different combinations of clinical and volumetric MRI measures were systematically assessed in predicting clinical outcomes. ML models were evaluated using Monte Carlo cross-validation, area under the curve (AUC), and permutation testing to assess significance. Results: The ML models significantly determined clinical impairment at baseline for the Amsterdam cohort, but did not reach significance for predicting clinical worsening over a follow-up of 2 and 5 years. High disability (EDSS ≥ 4) was best determined by a support vector machine (SVM) classifier using clinical and global MRI volumes (AUC = 0.83 ± 0.07, p = 0.015). Impaired cognition (SDMT Z-score ≤ −1.5) was best determined by an SVM using regional MRI volumes (thalamus, ventricles, lesions, and hippocampus), reaching an AUC of 0.73 ± 0.04 (p = 0.008). Conclusion: ML models could aid in classifying pwMS with clinical impairment and identify relevant biomarkers, but prediction of clinical worsening is an unmet need.
clinical neurology
What problem does this paper attempt to address?
This paper asks whether machine-learning (ML) methods can classify clinical impairment in people with multiple sclerosis (pwMS) and predict their clinical worsening. Specifically, it has two aims:

1. **Evaluate the effectiveness of machine-learning methods**: Determine whether ML methods can classify clinical impairment in pwMS at baseline and predict clinical worsening over the following 2 and 5 years.
2. **Determine the optimal combination**: If so, identify which combination of clinical and magnetic resonance imaging (MRI) features and which ML algorithm performs best.

### Research background

Multiple sclerosis is a chronic inflammatory, demyelinating, and neurodegenerative disease with a heterogeneous and unpredictable course. Monitoring disease progression and optimizing treatment strategies requires effective prognostic biomarkers. Traditional inflammatory MRI markers (such as white matter lesion counts) are widely used in the diagnosis and monitoring of MS, but they explain symptom severity and predict clinical progression only to a limited extent. In contrast, neurodegenerative MRI markers are more closely related to clinical outcomes and are considered the main drivers of irreversible disability.

### Research objectives

1. **Classify clinical impairment**: Use ML methods to classify clinical impairment in pwMS at baseline.
2. **Predict clinical worsening**: Use baseline clinical and structural MRI data to predict clinical worsening over the following 2 and 5 years.

### Methods

- **Data sources**: Baseline clinical and structural MRI data from two MS cohorts, Berlin (n = 125) and Amsterdam (n = 330).
- **Machine-learning models**: Five ML models (including logistic regression, support vector machines, gradient boosting, and random forest) were evaluated on classifying clinical impairment and predicting clinical worsening.
- **Evaluation**: Models were evaluated with Monte Carlo cross-validation and area under the curve (AUC); permutation tests were used to assess significance.

### Results

- **Classifying clinical impairment**:
  - For the Amsterdam cohort, the ML models significantly determined clinical impairment at baseline.
  - High disability (EDSS ≥ 4) was best classified by a support vector machine (SVM) using clinical and global MRI volumes (AUC = 0.83 ± 0.07, p = 0.015).
  - Cognitive impairment (SDMT Z-score ≤ −1.5) was best classified by an SVM using regional MRI volumes (thalamus, ventricles, lesions, and hippocampus), with an AUC of 0.73 ± 0.04 (p = 0.008).
- **Predicting clinical worsening**:
  - The ML models did not significantly predict clinical worsening over the 2- and 5-year follow-up periods.
  - Some models reached a relatively high AUC for predicting EDSS worsening (for example, an SVM-RBF using global MRI volumes as input features, AUC = 0.73 ± 0.13), but none reached significance in the permutation test (p = 0.163).

### Conclusion

ML models could aid in classifying pwMS with clinical impairment and in identifying relevant biomarkers. Predicting clinical worsening, however, remains an unmet need: ML shows potential for classifying impairment at baseline, but not yet for predicting long-term worsening.
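
The evaluation scheme named in the Methods (Monte Carlo cross-validation, AUC, and a permutation test) can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are synthetic, a simple class-mean scorer stands in for the SVM and the other models, and every function name, the number of splits, the test fraction, and the number of permutations are assumptions made for this example only.

```python
import random
from statistics import mean


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid_scorer(X, y):
    """Stand-in model (NOT the paper's SVM): project onto the difference of class means."""
    dims = range(len(X[0]))
    mu1 = [mean(x[j] for x, t in zip(X, y) if t == 1) for j in dims]
    mu0 = [mean(x[j] for x, t in zip(X, y) if t == 0) for j in dims]
    w = [a - b for a, b in zip(mu1, mu0)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def monte_carlo_cv_auc(X, y, n_splits=30, test_frac=0.3, seed=0):
    """Mean test AUC over repeated random train/test splits (Monte Carlo cross-validation)."""
    if min(y.count(0), y.count(1)) < 2:
        raise ValueError("need at least two samples per class")
    rng = random.Random(seed)
    idx = list(range(len(y)))
    n_test = max(2, round(test_frac * len(y)))
    aucs = []
    while len(aucs) < n_splits:
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        # Skip splits where either side lacks one of the two classes.
        if len({y[i] for i in test}) < 2 or len({y[i] for i in train}) < 2:
            continue
        score = fit_centroid_scorer([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([score(X[i]) for i in test], [y[i] for i in test]))
    return mean(aucs)


def permutation_p_value(X, y, n_permutations=100, seed=0):
    """Compare the observed AUC with AUCs obtained after shuffling the labels."""
    observed = monte_carlo_cv_auc(X, y, seed=seed)
    rng = random.Random(seed + 1)
    y_perm = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(y_perm)
        hits += monte_carlo_cv_auc(X, y_perm, seed=seed) >= observed
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_permutations + 1)


def make_toy_cohort(n=80, seed=1):
    """Synthetic data: one informative feature, one pure-noise feature."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_cohort()
    observed_auc, p_value = permutation_p_value(X, y)
    print(f"mean AUC = {observed_auc:.2f}, permutation p = {p_value:.3f}")
```

The permutation test is what separates the two kinds of result reported above: an AUC of 0.73 can still fail to reach significance if models trained on shuffled labels reach a similar AUC often enough, which is more likely when the test splits are small or the AUC varies widely between splits.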