Identifying data-driven subtypes of major depressive disorder with electronic health records

Abhishek Sharma, Pilar F. Verhaak, Thomas H. McCoy, Roy H. Perlis, Finale Doshi-Velez
DOI: https://doi.org/10.1016/j.jad.2024.03.162
IF: 6.533
2024-04-10
Journal of Affective Disorders
Abstract: Background: Efforts to reduce the heterogeneity of major depressive disorder (MDD) by identifying subtypes have not yet facilitated treatment personalization or investigation of biology, so novel approaches merit consideration. Methods: We utilized electronic health records drawn from 2 academic medical centers and affiliated health systems in Massachusetts to identify data-driven subtypes of MDD, characterizing sociodemographic features, comorbid diagnoses, and treatment patterns. We applied Latent Dirichlet Allocation (LDA) to summarize diagnostic codes, followed by agglomerative clustering to define patient subgroups. Results: Among 136,371 patients (95,034 women [70%]; 41,337 men [30%]; mean [SD] age, 47.0 [14.0] years), the 15 putative MDD subtypes were characterized by comorbidities and distinct patterns in medication use. There was substantial variation in rates of selective serotonin reuptake inhibitor (SSRI) use (from a low of 62% to a high of 78%) and serotonin-norepinephrine reuptake inhibitor (SNRI) use (from 4% to 21%). Limitations: Electronic health records lack reliable symptom-level data, so we cannot examine the extent to which subtypes might differ in clinical presentation or symptom dimensions. Conclusion: These data-driven subtypes, drawing on representative clinical cohorts, merit further investigation for their utility in identifying more homogeneous patient populations for basic as well as clinical investigation.
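The two-stage approach named in the Methods (LDA over diagnostic codes, then agglomerative clustering) might look roughly like the sketch below. It is not the authors' pipeline: the toy patient records, the code tokens, the number of topics and clusters, Ward linkage, and the helper name `cluster_patients` are all assumptions made for illustration, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data: one string per patient, listing made-up
# diagnostic code tokens (repeats mean the code was recorded more than once).
patients = [
    "F32 F41 F41 G47",
    "F32 F41 G47",
    "F33 I10 E11 E11",
    "F33 I10 E11",
    "F32 M54 G43 M54",
    "F33 M54 G43",
]


def cluster_patients(docs, n_topics=3, n_clusters=3, seed=0):
    """Summarize code counts with LDA, then cluster the topic mixtures.

    All parameter values are illustrative assumptions, not the paper's settings.
    """
    # Stage 0: patient-by-code count matrix (codes kept case-sensitive).
    vectorizer = CountVectorizer(lowercase=False, token_pattern=r"\S+")
    counts = vectorizer.fit_transform(docs)

    # Stage 1: LDA maps each patient to a mixture over latent "topics"
    # (groups of co-occurring codes); each row sums to 1.
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    topic_mix = lda.fit_transform(counts)

    # Stage 2: agglomerative clustering on the topic mixtures assigns
    # each patient to one subgroup. Ward linkage is an assumption here.
    clusterer = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    labels = clusterer.fit_predict(topic_mix)
    return topic_mix, labels


topic_mix, labels = cluster_patients(patients)
print(np.round(topic_mix, 2))
print(labels)
```

Clustering on topic mixtures rather than raw code counts keeps the feature space small and dense, which is one plausible reason to place a topic model in front of the clustering step; the abstract itself does not state the motivation.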
psychiatry,clinical neurology