Predicting diagnostic progression to schizophrenia or bipolar disorder via machine learning applied to electronic health record data

Lasse Hansen, Martin Bernstorff, Kenneth Enevoldsen, Sara Kolding, Jakob Grøhn Damgaard, Erik Perfalk, Kristoffer Laigaard Nielbo, Andreas Aalkjaer Danielsen, Søren Dinesen Østergaard
DOI: https://doi.org/10.1101/2024.07.02.24309828
2024-07-03
Abstract:
Importance: The diagnosis of schizophrenia and bipolar disorder is often delayed by several years, even though illness typically emerges in late adolescence or early adulthood. This delay impedes initiation of targeted treatment.
Objective: To investigate whether machine learning models trained on routine clinical data from electronic health records (EHRs) can predict diagnostic progression to schizophrenia or bipolar disorder among patients undergoing treatment in psychiatric services for other mental illness.
Design: Cohort study based on data from EHRs.
Setting: The psychiatric services of the Central Denmark Region.
Participants: All patients aged ≥15 and <60 years with at least one contact with the psychiatric services of the Central Denmark Region between 2011 and 2021. Patients with only a single contact were removed, leaving a total of 24,449 eligible patients with 398,922 outpatient contacts with the psychiatric services.
Exposures: Predictors based on EHR data, including medications, diagnoses, and clinical notes.
Main Outcomes and Measures: Diagnostic transition to schizophrenia or bipolar disorder within 5 years, predicted one day before outpatient contacts by means of regularized logistic regression and Extreme Gradient Boosting (XGBoost) models.
Results: Transition to the first occurrence of either schizophrenia or bipolar disorder was predicted by the XGBoost model with an area under the receiver operating characteristic curve (AUROC) of 0.70 on the training set and 0.64 on the test set, which consisted of two held-out hospital sites. At a predicted positive rate of 4%, the XGBoost model had a sensitivity of 9.3%, a specificity of 96.3%, and a positive predictive value of 13.0%. Predicting schizophrenia and bipolar disorder separately yielded AUROCs of 0.80 and 0.62, respectively, on the test set. The clinical notes proved particularly informative for prediction.
Conclusions and Relevance: It is possible to predict diagnostic transition to schizophrenia and bipolar disorder from routine clinical data extracted from EHRs, with schizophrenia being notably easier to predict than bipolar disorder.
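The operating-point figures reported in the Results (sensitivity, specificity, and positive predictive value at a 4% predicted positive rate) are linked by simple arithmetic, and a short sketch may make that relationship concrete. The Python below is illustrative only and is not the authors' code. The function names are hypothetical, the toy labels and scores are made up, and the implied outcome prevalence is derived here rather than reported in the abstract, on the assumption that all three figures refer to the same set of prediction times.

```python
def metrics_at_positive_rate(y_true, y_score, positive_rate):
    """Flag the top `positive_rate` fraction of scores as positive and
    return sensitivity, specificity and PPV for that operating point.

    Hypothetical helper; assumes y_true contains both 0s and 1s.
    """
    n = len(y_true)
    n_flagged = max(1, round(n * positive_rate))
    # Indices ordered from highest to lowest predicted score.
    order = sorted(range(n), key=lambda i: y_score[i], reverse=True)
    flagged = order[:n_flagged]

    tp = sum(1 for i in flagged if y_true[i] == 1)
    fp = n_flagged - tp
    fn = sum(y_true) - tp
    tn = n - n_flagged - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


def implied_prevalence_and_specificity(positive_rate, ppv, sensitivity):
    """Derive the outcome prevalence and specificity implied by a
    predicted positive rate, PPV and sensitivity (all as fractions)."""
    tp = positive_rate * ppv          # true positives as a share of all predictions
    fp = positive_rate * (1 - ppv)    # false positives as a share of all predictions
    prevalence = tp / sensitivity     # sensitivity = TP / (all positives)
    specificity = 1 - fp / (1 - prevalence)
    return prevalence, specificity


if __name__ == "__main__":
    # Made-up toy data: 10 prediction times, 2 true transitions.
    y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    y_score = [0.90, 0.80, 0.10, 0.20, 0.30, 0.05, 0.15, 0.25, 0.35, 0.40]
    print(metrics_at_positive_rate(y_true, y_score, positive_rate=0.2))

    # Figures taken from the abstract: PPR 4%, PPV 13.0%, sensitivity 9.3%.
    prevalence, specificity = implied_prevalence_and_specificity(0.04, 0.13, 0.093)
    print(f"implied prevalence: {prevalence:.3f}, implied specificity: {specificity:.3f}")
```

Under these assumptions, the three abstract figures imply that roughly 5.6% of prediction times were followed by a transition, and the specificity recovered this way agrees with the reported 96.3%.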
Psychiatry and Clinical Psychology