Machine Learning Models for Diagnosis of Parkinson's Disease Using Multiple Structural Magnetic Resonance Imaging Features.
Yang Ya, Lirong Ji, Yujing Jia, Nan Zou, Zhen Jiang, Hongkun Yin, Chengjie Mao, Weifeng Luo, Erlei Wang, Guohua Fan
DOI: https://doi.org/10.3389/fnagi.2022.808520
IF: 4.8
2022-01-01
Frontiers in Aging Neuroscience
Abstract: Purpose: This study aimed to develop machine learning models for the diagnosis of Parkinson's disease (PD) using multiple structural magnetic resonance imaging (MRI) features and to validate their performance. Methods: Brain structural MRI scans of 60 patients with PD and 56 normal controls (NCs) were used as the development dataset, and scans of 69 patients with PD and 71 NCs from the Parkinson's Progression Markers Initiative (PPMI) dataset were used as the independent test dataset. First, multiple structural MRI features were extracted from cerebellar, subcortical, and cortical regions of the brain. Then, Pearson's correlation test and least absolute shrinkage and selection operator (LASSO) regression were used to select the most discriminating features. Finally, using a logistic regression (LR) classifier with a 5-fold cross-validation scheme in the development dataset, a cerebellar model, a subcortical model, a cortical model, and a combined model based on all features were constructed separately. The diagnostic performance and clinical net benefit of each model were evaluated with receiver operating characteristic (ROC) analysis and decision curve analysis (DCA) in both datasets. Results: After feature selection, 5 cerebellar features (absolute value of left lobule crus II cortical thickness (CT) and right lobule IV volume, relative value of right lobule VIIIA CT and lobule VI/VIIIA gray matter volume), 3 subcortical features (asymmetry index of caudate volume, relative value of left caudate volume, and absolute value of right lateral ventricle), and 4 cortical features (local gyrification index of right anterior circular insular sulcus and anterior agranular insula complex, local fractal dimension of right middle insular area, and CT of left supplementary and cingulate eye field) were selected as the most distinguishing features.
The area under the curve (AUC) values of the cerebellar, subcortical, cortical, and combined models were 0.679, 0.555, 0.767, and 0.781, respectively, for the development dataset and 0.646, 0.632, 0.690, and 0.756, respectively, for the independent test dataset. The combined model showed higher performance than the other models (DeLong's test, all p-values < 0.05). All models showed good calibration, and the DCA demonstrated that the combined model had a higher net benefit than the other models. Conclusion: The combined model showed favorable diagnostic performance and clinical net benefit and has the potential to be used as a non-invasive method for the diagnosis of PD.
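
The two evaluation quantities named in the abstract, the AUC from ROC analysis and the net benefit from DCA, can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `net_benefit` and the toy labels and probabilities are hypothetical and chosen only for illustration. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and the net benefit uses the standard decision-curve formula TP/n - FP/n × pt/(1 - pt) at threshold probability pt.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Decision-curve net benefit at one threshold: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    # A case is classified as positive when its predicted probability reaches the threshold.
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data: 1 = PD, 0 = normal control (not taken from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_probs = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

toy_auc = roc_auc(toy_labels, toy_probs)
toy_nb = net_benefit(toy_labels, toy_probs, threshold=0.5)
# Reference strategy "classify everyone as PD": every probability set to 1.0.
toy_nb_all = net_benefit(toy_labels, [1.0] * len(toy_labels), threshold=0.5)

print(f"AUC = {toy_auc:.3f}")
print(f"net benefit (model) = {toy_nb:.3f}, (treat all) = {toy_nb_all:.3f}")
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing each model against the "treat all" and "treat none" (net benefit 0) reference strategies.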