"This is a quiz" Premise Input: A Key to Unlocking Higher Diagnostic Accuracy in Large Language Models

Yusuke Asari, Ryo Kurokawa, Yuki Sonoda, Akifumi Hagiwara, Jun Kamohara, Takahiro Fukushima, Wataru Gonoi, Osamu Abe
DOI: https://doi.org/10.1101/2024.09.20.24314101
2024-09-23
Abstract:
Purpose: Large language models (LLMs) are neural network models trained on vast amounts of textual data, showing promising performance in various fields. In radiology, studies have demonstrated the strong performance of LLMs in diagnostic imaging quiz cases. However, the prior probabilities of a final diagnosis differ inherently between clinical and quiz cases, which poses a challenge for LLMs: in previous literature the models were not informed about the quiz nature of the cases, whereas human physicians can optimize the diagnosis, consciously or unconsciously, depending on the situation. The present study aimed to test the hypothesis that notifying LLMs about the quiz nature might improve diagnostic accuracy.
Methods: One hundred and fifty consecutive cases from the "Case of the Week" radiological diagnostic quiz case series on the American Journal of Neuroradiology website were analyzed. GPT-4o and Claude 3.5 Sonnet were used to generate the top three differential diagnoses based on the textual clinical history and figure legends. For both models, the prompts either included or excluded information about the quiz nature. Two radiologists evaluated the accuracy of the diagnoses. The McNemar test was used to assess differences in correct response rates.
Results: Informing the models of the quiz nature improved the diagnostic performance of both. Specifically, the primary diagnosis of Claude 3.5 Sonnet and the top three differential diagnoses of GPT-4o improved significantly when the models were informed of the quiz nature.
Conclusion: Informing LLMs of the quiz nature of cases significantly enhances their diagnostic performance. This insight into LLM capabilities could inform future research and applications, highlighting the importance of context in optimizing LLM-based diagnostics.
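The abstract states that a McNemar test compared correct response rates between prompts with and without the quiz premise. The sketch below illustrates how such a paired comparison could be computed. It assumes the exact (binomial) two-sided variant, which the abstract does not specify; the function name `mcnemar_exact` and the ten outcomes are hypothetical and are not data from the study.

```python
from math import comb


def mcnemar_exact(without_premise, with_premise):
    """Exact two-sided McNemar test on paired correct/incorrect outcomes.

    Each argument is a sequence of booleans, one per case, marking whether
    the model's diagnosis was judged correct under that prompt condition.
    Returns (b, c, p_value), where b and c are the discordant-pair counts.
    """
    if len(without_premise) != len(with_premise):
        raise ValueError("both conditions must cover the same cases")

    # b: correct without the premise, incorrect with it
    b = sum(1 for wo, wi in zip(without_premise, with_premise) if wo and not wi)
    # c: incorrect without the premise, correct with it
    c = sum(1 for wo, wi in zip(without_premise, with_premise) if not wo and wi)

    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, so no evidence of a difference

    # Under the null hypothesis, b follows Binomial(n, 0.5); double the smaller tail.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical outcomes for ten cases (illustration only, not study data).
without = [True, True, False, False, False, False, False, True, False, False]
with_q = [True, True, True, True, True, True, False, False, False, True]

b, c, p = mcnemar_exact(without, with_q)
print(f"discordant pairs: b={b}, c={c}, p={p:.5f}")
```

Only the discordant pairs (cases where the two prompt conditions disagree) enter the test, which is why a paired test suits a design in which the same cases are scored under both prompts.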
What problem does this paper attempt to address?