Associations Between Natural Language Processing-Enriched Social Determinants of Health and Suicide Death Among US Veterans

Avijit Mitra, Richeek Pradhan, Rachel D Melamed, Kun Chen, David C Hoaglin, Katherine L Tucker, Joel I Reisman, Zhichao Yang, Weisong Liu, Jack Tsai, Hong Yu
DOI: https://doi.org/10.1001/jamanetworkopen.2023.3079
2023-03-01
Abstract:
Importance: Social determinants of health (SDOHs) are known to be associated with increased risk of suicidal behaviors, but few studies use SDOHs from unstructured electronic health record notes.
Objective: To investigate associations between veterans' death by suicide and recent SDOHs, identified using structured and unstructured data.
Design, setting, and participants: This nested case-control study included veterans who received care under the US Veterans Health Administration from October 1, 2010, to September 30, 2015. A natural language processing (NLP) system was developed to extract SDOHs from unstructured clinical notes. Structured data yielded 6 SDOHs (ie, social or familial problems, employment or financial problems, housing instability, legal problems, violence, and nonspecific psychosocial needs), NLP on unstructured data yielded 8 SDOHs (social isolation, job or financial insecurity, housing instability, legal problems, barriers to care, violence, transition of care, and food insecurity), and combining them yielded 9 SDOHs. Data were analyzed in May 2022.
Exposures: Occurrence of SDOHs over a maximum span of 2 years compared with no occurrence of SDOH.
Main outcomes and measures: Cases of suicide death were matched with 4 controls on birth year, cohort entry date, sex, and duration of follow-up. Suicide was ascertained by National Death Index, and patients were followed up for up to 2 years after cohort entry with a study end date of September 30, 2015. Adjusted odds ratios (aORs) and 95% CIs were estimated using conditional logistic regression.
Results: Of 6 122 785 veterans, 8821 committed suicide during 23 725 382 person-years of follow-up (incidence rate 37.18 per 100 000 person-years). These 8821 veterans were matched with 35 284 control participants. The cohort was mostly male (42 540 [96.45%]) and White (34 930 [79.20%]), with 6227 (14.12%) Black veterans. The mean (SD) age was 58.64 (17.41) years. Across the 5 common SDOHs, NLP-extracted SDOH, on average, retained 49.92% of structured SDOHs and covered 80.03% of all SDOH occurrences. SDOHs, obtained by structured data and/or NLP, were significantly associated with increased risk of suicide. The 3 SDOHs with the largest effect sizes were legal problems (aOR, 2.66; 95% CI, 2.46-2.89), violence (aOR, 2.12; 95% CI, 1.98-2.27), and nonspecific psychosocial needs (aOR, 2.07; 95% CI, 1.92-2.23), when obtained by combining structured data and NLP.
Conclusions and relevance: In this study, NLP-extracted SDOHs, with and without structured SDOHs, were associated with increased risk of suicide among veterans, suggesting the potential utility of NLP in public health studies.
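The descriptive figures in the Results can be checked with simple arithmetic. The short sketch below is illustrative only and is not the authors' analysis code: it recomputes the incidence rate, the number of matched controls, and the cohort percentages from the counts quoted in the abstract. All variable and function names are assumptions introduced for this example.

```python
# Counts quoted in the abstract (Results and Main outcomes sections).
suicide_deaths = 8821          # cases
person_years = 23_725_382      # total follow-up time
controls_per_case = 4          # matching ratio


def incidence_per_100k(events: int, follow_up_years: float) -> float:
    """Crude incidence rate per 100,000 person-years."""
    return events / follow_up_years * 100_000


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return part / whole * 100


rate = incidence_per_100k(suicide_deaths, person_years)

# 1:4 matching implies four controls per case.
matched_controls = suicide_deaths * controls_per_case
cohort_size = suicide_deaths + matched_controls

# Demographic shares of the matched cohort (cases plus controls).
male_pct = percent(42_540, cohort_size)
white_pct = percent(34_930, cohort_size)
black_pct = percent(6_227, cohort_size)

print(f"Incidence rate: {rate:.2f} per 100,000 person-years")
print(f"Matched controls: {matched_controls}, cohort size: {cohort_size}")
print(f"Male: {male_pct:.2f}%, White: {white_pct:.2f}%, Black: {black_pct:.2f}%")
```

The recomputed values agree with the abstract, which confirms that the reported percentages use the matched cohort of cases plus controls as the denominator rather than the full population of 6 122 785 veterans.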