Large language models to identify social determinants of health in electronic health records

Marco Guevara, Shan Chen, Spencer Thomas, Tafadzwa L. Chaunzwa, Idalid Franco, Benjamin H. Kann, Shalini Moningi, Jack M. Qian, Madeleine Goldstein, Susan Harper, Hugo J. W. L. Aerts, Paul J. Catalano, Guergana K. Savova, Raymond H. Mak, Danielle S. Bitterman
DOI: https://doi.org/10.1038/s41746-023-00970-0
IF: 15.2
2024-01-12
npj Digital Medicine
Abstract: Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71), and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). Adding LLM-generated synthetic data to training varied across models and architecture, but improved the performance of smaller Flan-T5 models (delta F1 +0.12 to +0.23). Our best-fine-tuned models outperformed zero- and few-shot performance of ChatGPT-family models in the zero- and few-shot setting, except GPT4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs in improving real-world evidence on SDoH and assisting in identifying patients who could benefit from resource support.
health care sciences & services, medical informatics
What problem does this paper attempt to address?
The paper addresses the incomplete or missing documentation of social determinants of health (SDoH) in electronic health records (EHRs). Specifically, it explores how large language models (LLMs) can extract six SDoH categories from the narrative text in EHRs: employment, housing, transportation, parental status, relationship, and social support. The aim is to improve real-world evidence on SDoH, support research and clinical care, and identify patients who may need resource support.

### Problems the paper attempts to solve

1. **Insufficient SDoH documentation**: In the structured data of EHRs, SDoH information is often incomplete or missing, which hinders research and clinical care.
2. **Class imbalance**: SDoH information is sparsely documented in EHRs, so the training data are severely imbalanced across classes.
3. **Algorithmic bias**: SDoH extraction methods may perform differently across patient groups and may reproduce social biases, so these biases need to be evaluated and reduced.

### Methods and results

- **Model selection and optimization**: The study fine-tuned LLMs, including Flan-T5 XL and Flan-T5 XXL, to extract SDoH mentions.
- **Synthetic data generation**: The study augmented the training set with LLM-generated synthetic data, especially for categories with scarce data, such as housing. The effect varied across models and architectures, but it improved the smaller Flan-T5 models (delta F1 +0.12 to +0.23).
- **Performance evaluation**: The best models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71) and fine-tuned Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70).
- **Bias evaluation**: The study tested how predictions changed when race/ethnicity and gender descriptors were added to the text. Fine-tuned models were less likely than ChatGPT to change their prediction (p < 0.05), suggesting less algorithmic bias.

### Conclusions

- **Technical contributions**: The study shows the potential of LLMs to improve real-world evidence on SDoH, especially for sparsely documented information.
- **Clinical applications**: Extracting SDoH information from EHRs can help identify patients who need social work and resource support, which may improve the quality and equity of healthcare.
- **Future directions**: The study suggests further exploring synthetic data generation to improve model performance on real clinical text and to reduce algorithmic bias.

Taken together, the study improves the accuracy of SDoH extraction and provides a reference point for future clinical natural language processing tasks.
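
The macro-F1 scores reported above weight every class equally, which matters when SDoH mentions are rare. The following is a minimal sketch of that metric in plain Python. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper's code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    per_class = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical, heavily imbalanced toy labels (not real EHR data):
# eight sentences with no SDoH mention, one housing, one employment.
y_true = ["none"] * 8 + ["housing", "employment"]
# A degenerate classifier that always predicts the majority class.
y_pred = ["none"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} macro_f1={score:.2f}")
```

In this toy case the always-majority classifier reaches 0.80 accuracy but a macro-F1 of about 0.30, because the two rare classes each contribute an F1 of 0. This illustrates why a macro-averaged score is a more informative target than accuracy under class imbalance.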