Interpretable machine learning model for predicting the prognosis of antibody positive autoimmune encephalitis patients

Junshuang Guo, Ruirui Dong, Ruike Zhang, Fan Yang, Yating Wang, Wang Miao
DOI: https://doi.org/10.1016/j.jad.2024.10.010
2024-10-05
Abstract: Objective: To use nine machine learning (ML) methods to predict the prognosis of patients with antibody positive autoimmune encephalitis (AE). Methods: Encephalitis data from the Global Burden of Disease (GBD) study were analyzed to characterize the disease burden of encephalitis. This study included 187 patients with AE, with 121 patients assigned to the training set and 67 patients to the validation set. Nine ML methods were used to construct predictive models: decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), k-nearest neighbor (KNN), support vector machine (SVM), naive Bayes (NB), neural network (NN), light gradient boosting machine (LGBM), and logistic regression (LR). The constructed models were validated for discrimination, calibration, and clinical applicability using the validation set data. Shapley additive explanation (SHAP) analysis was used to explain the model. Results: Worldwide encephalitis deaths, incidence, and prevalence increased every year from 2010 to 2021. The training set included 121 patients with AE. Univariate analysis and LASSO screening identified six variables. Among the nine ML models, RF had the highest accuracy (0.860), followed by XGBoost (0.826); their F1 scores were 0.844 and 0.807, respectively. Validation set data showed good discrimination, calibration, and clinical applicability of the model. The SHAP values of infection, CSF monocyte percentage, and prealbumin were 0.906, 0.790, and 0.644, respectively. Limitations: Because AE is a rare disease, the sample size of this study is relatively small. Conclusion: The models constructed using RF and XGBoost show good performance, discrimination, calibration, clinical applicability, and interpretability.
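The abstract ranks the models by accuracy and F1 score. As a reminder of how these two metrics relate to a binary confusion matrix, here is a minimal sketch in plain Python. The function names and the toy labels are illustrative assumptions, not the study's code or data, and the abstract does not state which outcome class was treated as positive.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (hypothetical, NOT patient data): 1 = poor prognosis, 0 = good.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(accuracy(y_true, y_pred))  # 0.8
print(f1_score(y_true, y_pred))  # 0.75
```

Accuracy counts every correct prediction equally, whereas F1 ignores true negatives, so the two can diverge when outcome classes are imbalanced. This is one reason to report both, as the abstract does.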