Evaluating the Application of ChatGPT in Outpatient Triage Guidance: A Comparative Study

Dou Liu, Ying Han, Xiandi Wang, Xiaomei Tan, Di Liu, Guangwu Qian, Kang Li, Dan Pu, Rong Yin
2024-04-27
Abstract: The integration of Artificial Intelligence (AI) in healthcare has transformative potential for enhancing operational efficiency and health outcomes. Large Language Models (LLMs), such as ChatGPT, have shown their capabilities in supporting medical decision-making, and embedding LLMs in medical systems is becoming a promising trend in healthcare development. The potential of ChatGPT to address the triage problem in emergency departments has been examined, but few studies have explored its application in outpatient departments. With a focus on streamlining workflows and enhancing efficiency for outpatient triage, this study evaluates the consistency of responses provided by ChatGPT in outpatient guidance, including both within-version response analysis and between-version comparisons. Within versions, the results indicate that the internal response consistency of ChatGPT-4.0 is significantly higher than that of ChatGPT-3.5 (p=0.03), and both show moderate consistency in their top recommendation (71.2% for 4.0 and 59.6% for 3.5). However, the between-version consistency is relatively low (mean consistency score=1.43/3, median=1), indicating that few recommendations match between the two versions. In addition, only 50% of top recommendations match perfectly in the comparisons. Interestingly, ChatGPT-3.5 responses are more likely to be complete than those from ChatGPT-4.0 (p=0.02), suggesting possible differences in information processing and response generation between the two versions. The findings offer insights into AI-assisted outpatient operations and help in exploring the potential and limitations of LLMs in healthcare utilization. Future research may focus on optimizing LLMs and AI integration in healthcare systems based on ergonomic and human factors principles, aligned with the specific needs of effective outpatient triage.
Computation and Language, Artificial Intelligence, Human-Computer Interaction
What problem does this paper attempt to address?
The paper examines how consistently ChatGPT performs when used for outpatient triage guidance. Specifically, the study evaluates the response consistency of ChatGPT in Chinese outpatient guidance in two ways: by analyzing repeated responses within a single version (ChatGPT-3.5 or ChatGPT-4.0) and by comparing responses between the two versions. Within versions, the response consistency of ChatGPT-4.0 was significantly higher than that of ChatGPT-3.5 (p=0.03). Between versions, consistency was relatively low (mean consistency score of 1.43 out of 3), meaning the two versions often recommended different departments. The study also notes that, although ChatGPT has the potential to improve efficiency in outpatient triage, further optimization is needed before practical use to ensure accuracy and reliability. Future research can focus on how to better integrate large language models (LLMs) and other AI technologies into healthcare systems to meet the specific needs of efficient outpatient triage.
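To make the between-version figures more concrete, the sketch below shows one plausible way such a score could be computed. It is an illustration under stated assumptions, not the authors' procedure: it assumes each version returns a ranked list of up to three departments per case, that the consistency score is the number of departments shared by the two lists (0 to 3), and that a "top match" means the first-ranked departments are identical. The function names and the sample cases are hypothetical.

```python
from statistics import mean, median


def overlap_score(recs_a, recs_b):
    """Number of departments present in both lists (0..3 for top-3 lists).

    Assumption: order is ignored and each list has no duplicates.
    """
    return len(set(recs_a) & set(recs_b))


def top_match(recs_a, recs_b):
    """True when both lists are non-empty and share the same first entry."""
    return bool(recs_a) and bool(recs_b) and recs_a[0] == recs_b[0]


def summarize(pairs):
    """Aggregate overlap scores and the top-recommendation match rate."""
    scores = [overlap_score(a, b) for a, b in pairs]
    top_flags = [1.0 if top_match(a, b) else 0.0 for a, b in pairs]
    return {
        "mean_score": mean(scores),
        "median_score": median(scores),
        "top_match_rate": mean(top_flags),
    }


# Hypothetical cases: (version A top-3, version B top-3). Not data from the study.
pairs = [
    (["Cardiology", "Respiratory Medicine", "General Medicine"],
     ["Cardiology", "Emergency", "General Medicine"]),
    (["Dermatology", "Allergy", "General Medicine"],
     ["Allergy", "Rheumatology", "Infectious Diseases"]),
]

summary = summarize(pairs)
print(summary)
```

With this definition, a mean score near 1.4 out of 3 would mean that fewer than half of the three recommended departments typically coincide between versions, which matches the low between-version agreement described above.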