Clinical Sentiment Analysis by Large Language Models Enhances the Prediction of Hepatorenal Syndrome in Decompensated Cirrhosis

Mason Lai, Cynthia Fenton, Jessica Rubin, Chiung-Yu Huang, Mark J. Pletcher, Jennifer C. Lai, Giuseppe Cullaro, Jin Ge
DOI: https://doi.org/10.1101/2024.11.13.24317220
2024-11-13
Abstract:
Background and Aims: Hepatorenal syndrome-acute kidney injury (HRS-AKI) is a severe complication of decompensated cirrhosis that is challenging to predict. Sentiment analysis, a computational process of identifying and categorizing opinions and judgments expressed in text, may enhance traditional prediction methodologies based on structured variables. Large language models (LLMs), such as generative pre-trained transformers (GPTs), have demonstrated the ability to perform sentiment analysis on non-clinical texts. We sought to determine whether GPT-performed sentiment analysis could improve the prediction of HRS-AKI over clinical covariates alone.
Methods: We included adult patients admitted to a single academic medical center with decompensated cirrhosis and AKI. We used a protected health information (PHI)-compliant version of Microsoft Azure OpenAI GPT-4o to derive a sentiment score for HRS-AKI ranging from 0 to 1 and to conduct natural language processing (NLP) extraction of clinical terms associated with HRS-AKI from clinical notes. We compared the area under the receiver operating characteristic curve (AUROC) of logistic regression models incorporating structured variables (socio-demographics, MELD 3.0, hemodynamic parameters) with versus without sentiment scores and NLP-extracted clinical terms.
Results: In our cohort of 314 participants, a higher sentiment score was associated with the diagnosis of HRS-AKI (OR 1.33 per 0.1 increase, 95% CI 1.02-1.79) in multivariate models. The AUROC of the baseline model using structured clinical covariates alone was 0.639. With the addition of the GPT-4o-derived sentiment score and clinical terms to the structured covariates, the final model yielded an improved AUROC of 0.758 (p = 0.03).
Conclusions: Clinical texts contain large amounts of data that are currently difficult to extract using standard methodologies. Sentiment analysis and NLP-based variable derivation with GPT-4o are feasible in clinical applications and can improve the prediction of HRS-AKI over traditional modeling methodologies alone.
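To make the reported effect size concrete, the following minimal sketch rescales the odds ratio of 1.33 per 0.1 of sentiment score to other increments. It assumes the logistic model is linear in the sentiment score on the log-odds scale, which the abstract implies but does not state; the function name `or_for_increment` is illustrative and not taken from the paper.

```python
import math

def or_for_increment(or_per_step: float, step: float, increment: float) -> float:
    """Rescale an odds ratio reported per `step` to a different `increment`.

    Assumes a log-odds model that is linear in the predictor (an assumption,
    not something the abstract confirms).
    """
    beta_per_unit = math.log(or_per_step) / step  # log-odds change per 1.0 of score
    return math.exp(beta_per_unit * increment)

# Reported in the abstract: OR 1.33 per 0.1 of sentiment score.
or_02 = or_for_increment(1.33, 0.1, 0.2)  # equivalent to 1.33 ** 2
or_05 = or_for_increment(1.33, 0.1, 0.5)  # equivalent to 1.33 ** 5
print(f"OR per 0.2: {or_02:.2f}, OR per 0.5: {or_05:.2f}")
```

Under that linearity assumption, odds ratios compound multiplicatively, so a 0.2 difference in score corresponds to the per-0.1 odds ratio squared. The confidence bounds could be rescaled the same way.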