Abstract:

Background and objective: Because bipolar disorder (BD) and major depressive disorder (MDD) share genetic risk factors and present with similar neuropsychological symptoms, the two are at high risk of being confused, and misdiagnosis is associated with ineffective treatment and worse outcomes. We aimed to develop a machine learning (ML)-based diagnostic system that uses electronic medical records (EMR) data to mimic the clinical reasoning of physicians in differentiating MDD from BD (especially BD depressive episodes) in patients about to be admitted to a hospital, and hence to reduce the misdiagnosis of BD as MDD on admission. In addition, we examined to what extent the ML model could be made interpretable by quantifying and visualizing the features that drive its predictions.

Methods: We identified 16,311 patients admitted to a hospital in western China between 2009 and 2018 with a recorded main diagnosis of MDD or BD. For each of the two cohorts (MDD versus BD, and MDD versus BD depressive episodes) we established three sub-cohorts with different combinations of features. Four ML algorithms (logistic regression, extreme gradient boosting (XGBoost), random forest, and support vector machine) and four train-test splits were used to train and validate the diagnostic models. Explainability methods (SHAP and Break Down) were used to analyze the contribution of each feature at both the population level and the individual level, including feature importance, feature interaction, and the effect of each feature on the prediction for a specific subject.

Results: The XGBoost algorithm provided the best test performance for separating patients with BD from those with MDD (AUC: 0.838 (0.810-0.867), PPV: 0.810, NPV: 0.834). Core predictors included symptoms (mood-up, exciting, bad sleep, loss of interest, talking, mood-down, provoke), along with age, job, myocardial enzyme markers (creatine kinase, hydroxybutyrate dehydrogenase), a diabetes-associated marker (glucose), a bone function marker (alkaline phosphatase), a non-enzymatic antioxidant (uric acid), markers of immune function and inflammation (white blood cell count, lymphocyte count, basophil percentage, monocyte count), a cardiovascular function marker (low density lipoprotein), a renal marker (total protein), a liver biochemistry marker (indirect bilirubin), and vital signs such as pulse. For separating patients with BD depressive episodes from those with MDD, the test AUC was 0.777 (0.732-0.822), with PPV 0.576 and NPV 0.899. Additional validation in models built with self-reported symptoms removed from the feature set showed a test AUC of 0.701 (0.666-0.736) for differentiating BD from MDD, and an AUC of 0.564 (0.515-0.614) for distinguishing patients in BD depressive episodes from MDD patients. Validation in the datasets without removing patients with comorbidity showed an AUC of 0.826 (0.806-0.846).

Conclusion: The diagnostic system accurately identified patients with BD in various clinical scenarios, and the differences in patterns of peripheral markers between BD and MDD could enrich our understanding of the potential pathophysiological mechanisms underlying the two disorders.
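For readers less familiar with the reported metrics, the following is a minimal sketch of how AUC, PPV, and NPV are defined for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, the coding of BD as 1 and MDD as 0, and the toy labels and scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ppv_npv(labels: Sequence[int], scores: Sequence[float],
            threshold: float = 0.5) -> Tuple[float, float]:
    """Positive and negative predictive value at a given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical toy data: 1 = BD, 0 = MDD (assumed coding, not from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print("AUC:", auc(labels, scores))
print("PPV, NPV:", ppv_npv(labels, scores))
```

PPV and NPV depend on the chosen threshold and on the proportion of BD cases in the test set, so values such as the PPV of 0.576 reported for the BD depressive episodes cohort should be read alongside that cohort's class balance; AUC is threshold-free.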

Explainable machine-learning algorithms to differentiate bipolar disorder from major depressive disorder using self-reported symptoms, vital signs, and blood-based markers
