Multivariate Machine Learning Analyses in Identification of Major Depressive Disorder Using Resting-State Functional Connectivity: A Multicentral Study
Yachen Shi, Linhai Zhang, Zan Wang, Xiang Lu, Tao Wang, Deyu Zhou, Zhijun Zhang
DOI: https://doi.org/10.1021/acschemneuro.1c00256
2021-01-01
ACS Chemical Neuroscience
Abstract: Diagnosis of major depressive disorder (MDD) using resting-state functional connectivity (rs-FC) data faces many challenges, such as high dimensionality, small sample sizes, and individual differences. To assess the clinical value of rs-FC in MDD and to identify a potential rs-FC machine learning (ML) model for the individualized diagnosis of MDD, a progressive three-step ML analysis of rs-FC data was performed, comprising six different ML algorithms and two dimension reduction methods, to investigate the classification performance of ML models in a large multicenter dataset [1021 MDD patients and 1100 normal controls (NCs)]. Furthermore, a linear regression model fitted by least squares was used to assess the relationships between rs-FC features and the severity of clinical symptoms in MDD patients. Among the ML methods used, the rs-FC model constructed with the eXtreme Gradient Boosting (XGBoost) method showed the best classification performance for distinguishing MDD patients from NCs at the individual level (accuracy = 0.728, sensitivity = 0.720, specificity = 0.739, area under the curve = 0.831). Meanwhile, the rs-FCs identified by the XGBoost model were primarily distributed within and between the default mode network, limbic network, and visual network. More importantly, the individual 17-item Hamilton Depression Scale scores of MDD patients can be accurately predicted using rs-FC features identified by the XGBoost model (adjusted R² = 0.180, root mean squared error = 0.946). The XGBoost model using rs-FCs showed the best classification performance between MDD patients and NCs, with good generalization and neuroscientific interpretability.
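For readers unfamiliar with the performance measures reported in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, area under the curve, and adjusted R² are conventionally defined. It is not the authors' code, and the abstract does not state how these values were computed. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; none of the numbers come from the study.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn), with 1 = MDD patient and 0 = normal control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random patient outscores a random control.

    Ties between a patient score and a control score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Threshold the scores (assumed cut-off: 0.5) and compute the four measures."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": roc_auc(y_true, scores),  # threshold-independent
    }


def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjust R² for the number of predictors in a linear regression."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Toy data invented for illustration only (not from the study).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_metrics = classification_metrics(toy_labels, toy_scores)
# toy_metrics: accuracy 0.75, sensitivity 0.75, specificity 0.75, auc 0.8125
```

The pairwise formulation of the area under the curve is quadratic in the sample size; it is used here only because it makes the definition explicit. For a dataset of the size described above, a rank-based routine from an established library would be the more practical choice.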