Explainable machine learning-based prediction of depression severity in medical students

Dianhao Liu, Zequn Chen, Wesley J. Marrero, Nicholas C Jacobson, Thomas Thesen
DOI: https://doi.org/10.1101/2023.12.14.23299975
2024-09-24
Abstract:
Importance: Medical students exhibit depression or depressive symptoms at a higher rate than the general population, with a potential for reduced academic performance and increased risk of suicide and physical health problems. Understanding the factors contributing to depression severity is critical for early detection and for creating support systems personalized to the mental health challenges of medical students.
Objective: To predict the severity of depression in medical students across the United States, and to identify the key biopsychosocial factors influencing it, using explainable machine learning algorithms.
Design: This prognostic study uses data from the Healthy Minds Study student survey, collected across the US during the academic years 2009 to 2021. Students' depression severity is measured using the Patient Health Questionnaire-9 (PHQ-9).
Setting: All undergraduate and graduate students from the American universities participating in the Healthy Minds Study are included in our study. Although our target cohort is composed of medical students only, we include undergraduate and graduate students in our population to increase the amount of data available for training the machine learning models.
Participants: A total of 167,999 students completed the PHQ-9 inventory, including 2,174 medical students.
Main Outcome and Measure: Prediction accuracy is measured by the mean absolute error across a previously unseen cohort of medical students. This metric estimates the average absolute difference between the predicted and actual responses of the general medical student population.
Results: After testing several predictive machine learning algorithms, we found that a sequence of binary XGBoost models achieved the lowest mean absolute error among all interpretable algorithms. Depression severity was best predicted by the following factors, in order of importance: (1) history of depressive diagnosis, (2) disordered eating behaviors, (3) current financial stress, and (4) younger age.
Conclusions and Relevance: By identifying predictors of student risk for developing depressive symptoms, our findings can facilitate early identification of medical students at risk of developing depressive symptoms and help provide them with early support and proactive strategies to prevent future depressive episodes.
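The abstract names two technical ingredients: a sequence of binary models and the mean absolute error (MAE) as the accuracy measure. The following is a minimal sketch of how the two could fit together, not the authors' implementation. It assumes that each binary model k outputs the probability that a student's severity level exceeds k, and that the predicted level is the number of thresholds exceeded; the abstract does not state how the binary outputs are combined. The function names, the 0.5 threshold, and all numbers are hypothetical.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def ordinal_from_binary(exceed_probs: Sequence[float], threshold: float = 0.5) -> int:
    """Combine binary outputs P(y > k), k = 0..K-2, into one ordinal level.

    Assumption (not stated in the abstract): the predicted level is the
    number of thresholds the student is predicted to exceed.
    """
    return sum(1 for p in exceed_probs if p > threshold)


# Hypothetical outputs of four binary models (five severity levels, 0..4)
# for three students. These numbers are illustrative, not study data.
binary_outputs = [
    [0.90, 0.70, 0.20, 0.10],
    [0.30, 0.10, 0.05, 0.01],
    [0.95, 0.90, 0.80, 0.60],
]
actual_levels = [2, 1, 3]

predicted_levels = [ordinal_from_binary(p) for p in binary_outputs]
mae = mean_absolute_error(actual_levels, predicted_levels)
print(predicted_levels, round(mae, 3))
```

In this toy example the predictions are off by one level for two of the three students, so the MAE is 2/3 of a severity level. In the study's setting, the same quantity would be computed on the held-out cohort of medical students.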
Psychiatry and Clinical Psychology