The Application of LLMs for Radiologic Decision-Making

Hossam A. Zaki, Andrew Aoun, Saminah Munshi, Hazem Abdel-Megid, Lleayem Nazario-Johnson, Sun Ho Ahn
DOI: https://doi.org/10.1016/j.jacr.2024.01.007
IF: 6.24
2024-01-15
Journal of the American College of Radiology
Abstract: Background and Purpose: Large language models (LLMs) have seen explosive growth, but their potential role in medical applications remains underexplored. Our study investigates the capability of LLMs to predict the most appropriate imaging study for specific clinical presentations across various radiology subspecialties. Methods and Materials: ChatGPT (GPT-4) by OpenAI and Glass AI by Glass Health were tested on 1075 clinical scenarios from 11 ACR expert panels to determine the most appropriate imaging study, benchmarked against the ACR Appropriateness Criteria. Two responses per clinical presentation were generated and averaged to give the final clinical presentation score. The clinical presentation scores within each topic area were averaged to give the topic score, and the topic scores within each panel were averaged to give the final panel score. LLM responses were scored on a scale of 0 to 3, with partial scores given for non-specific answers. A Pearson correlation coefficient (R-value) was calculated for each panel to assess context-specific performance. Results: Glass AI scored significantly higher than ChatGPT (2.32 ± 0.67 vs 2.08 ± 0.74, p = 0.002). Both LLMs performed best in the Polytrauma, Breast, and Vascular panels and worst in the Neurologic, Musculoskeletal, and Cardiac panels. Glass AI outperformed ChatGPT in 10 of 11 panels, the exception being OB/GYN. Agreement was highest in the Pediatrics, Neurologic, and Thoracic panels, while the most disagreement occurred in the Vascular, Breast, and Urologic panels. Conclusion: LLMs can be used to predict appropriate imaging studies, with Glass AI's superior performance indicating the benefits of extra medical-text training. This supports the potential of LLMs in radiologic decision-making.
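The three-level averaging described in the Methods (responses to presentation, presentations to topic, topics to panel) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the function name `panel_scores` and the `(panel, topic, presentation_id, score)` record layout are hypothetical, and each score is assumed to be a number on the 0 to 3 scale.

```python
from collections import defaultdict
from statistics import mean


def panel_scores(records):
    """Aggregate per-response scores into one score per panel.

    `records` is an iterable of (panel, topic, presentation_id, score)
    tuples (a hypothetical layout chosen for illustration).
    """
    # Level 1: average the responses generated for each clinical presentation.
    by_presentation = defaultdict(list)
    for panel, topic, presentation_id, score in records:
        by_presentation[(panel, topic, presentation_id)].append(score)

    # Level 2: average the presentation scores within each topic area.
    by_topic = defaultdict(list)
    for (panel, topic, _), scores in by_presentation.items():
        by_topic[(panel, topic)].append(mean(scores))

    # Level 3: average the topic scores within each panel.
    by_panel = defaultdict(list)
    for (panel, _), presentation_means in by_topic.items():
        by_panel[panel].append(mean(presentation_means))

    return {panel: mean(topic_means) for panel, topic_means in by_panel.items()}
```

Because each level is an unweighted mean, a topic with few presentations counts as much toward its panel score as a topic with many, which follows from the averaging scheme as the abstract states it.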
radiology, nuclear medicine & medical imaging
What problem does this paper attempt to address?