Abstract

Background: ChatGPT is a large language model-based chatbot created by OpenAI. Since its release, ChatGPT has gained widespread attention in the healthcare community for its potential utility as a medical practice tool. In addition, ChatGPT can be adapted to serve as a clinical decision-support tool. In this study we explored the potential of ChatGPT as a decision-support tool for acute ulcerative colitis (UC) presentations in the emergency department (ED).

Methods: Our investigation centered on 20 distinct acute UC presentations to the ED, accumulated over two years. Case summaries containing key data points such as symptoms, vital signs, and laboratory results were processed by ChatGPT. For each case, we asked ChatGPT to assess disease severity based on the Truelove and Witts classification, substituting C-reactive protein ≥12 for erythrocyte sedimentation rate ≥30. It was then asked to recommend hospitalization or outpatient care for each case based on the disease severity. The answers were compared with assessments made by our department's gastroenterologists and with the actual decision made by the physician in the ED.

Results: Overall, ChatGPT classified 12 patients as having severe disease, 7 as moderate, and 1 as mild. For each case, ChatGPT supplied a detailed answer giving the severity of every variable in the criteria and an overall severity classification (Table 1). Compared to our gastroenterologists' assessments, ChatGPT assigned the same severity grade to 16/20 (80%) of the patients. A high degree of reliability was found between the two assessments: the average-measures intraclass correlation coefficient of absolute agreement was 0.839 (95% confidence interval 0.588-0.937, F = 5.95, p<0.001). Inconsistencies in four cases stemmed primarily from inaccurate cut-off values for systemic variables. Following severity assessment, ChatGPT leaned towards hospitalization for 16 of 18 patients (88.9%); for two moderate UC cases, however, it could not provide a decisive recommendation. By comparison, only 12 of the 20 patients were hospitalized in actual clinical practice.

Conclusion: Our findings suggest that ChatGPT has potential as a clinical decision-support tool for assessing UC severity and recommending a suitable setting for further treatment. While this concept warrants further investigation and validation, its ability to evaluate a clinical scenario against established criteria could greatly benefit the fields of inflammatory bowel disease and gastroenterology.
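To make the severity rule described in the Methods concrete, the following is a minimal Python sketch of a Truelove and Witts style classifier with C-reactive protein ≥12 used in place of the erythrocyte sedimentation rate. It is an illustration, not the study's prompt or scoring code. The cut-offs for stool frequency, temperature, pulse, and hemoglobin are the commonly cited ones and are assumptions here, since the abstract does not list them; the unit of the C-reactive protein threshold is not stated in the abstract either. The names `Presentation`, `classify`, and `percent` are hypothetical. The last lines simply re-derive the two proportions reported in the Results.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """One ED case summary (hypothetical structure, not the study's format)."""
    bloody_stools_per_day: int
    temperature_c: float
    pulse_bpm: int
    hemoglobin_g_dl: float
    crp: float  # unit not stated in the abstract; same unit as the >=12 cut-off


def has_systemic_toxicity(p: Presentation) -> bool:
    """True if at least one systemic variable crosses its cut-off.

    Cut-offs other than CRP >= 12 are commonly cited values (assumption).
    """
    return any([
        p.temperature_c > 37.8,
        p.pulse_bpm > 90,
        p.hemoglobin_g_dl < 10.5,
        p.crp >= 12,  # substitution for ESR described in the abstract
    ])


def classify(p: Presentation) -> str:
    """Return 'severe', 'moderate', or 'mild' for one presentation."""
    toxic = has_systemic_toxicity(p)
    if p.bloody_stools_per_day >= 6 and toxic:
        return "severe"
    if p.bloody_stools_per_day < 4 and not toxic:
        return "mild"
    return "moderate"  # everything between mild and severe


def percent(numerator: int, denominator: int) -> float:
    """Proportion as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


if __name__ == "__main__":
    example = Presentation(8, 38.2, 104, 9.8, 45.0)  # invented case
    print(classify(example))
    print(percent(16, 20))  # severity agreement reported in the Results
    print(percent(16, 18))  # hospitalization proportion reported in the Results
```

Encoding the rule explicitly like this also shows where the discrepancies mentioned in the Results could arise: a single shifted cut-off for a systemic variable is enough to move a case between the moderate and severe categories.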

Evaluating the role of large language models in inflammatory bowel disease patient information

P717 Evaluating the performance of Large Language Models in responding to patients' health queries: A comparative analysis with medical experts

Uncovering Language Disparity of ChatGPT in Healthcare: Non-English Clinical Environment for Retinal Vascular Disease Classification (Preprint)

Evaluating the use of large language model in identifying top research questions in gastroenterology

Large Language Models for Efficient Medical Information Extraction

May ChatGPT be a tool producing medical information for common inflammatory bowel disease patients’ questions? An evidence-controlled analysis

Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification

Large language models: a primer and gastroenterology applications

Large language model answers medical questions about standard pathology reports

Systematic review: The use of large language models as medical chatbots in digestive diseases

On the limitations of large language models in clinical diagnosis

P389 Comparative Evaluation of ChatGPT and Human Specialists in the Application of ECCO Guidelines for the Management of Inflammatory Bowel Diseases and Malignancies: A Proof-of-Concept Study

Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study

Comparative evaluation of a language model and human specialists in the application of European guidelines for the management of inflammatory bowel diseases and malignancies

Comparison of Large Language Models in Answering Immuno-Oncology Questions: A Cross-Sectional Study

P467 Towards AI-Augmented Clinical Decision Making: An Examination of ChatGPT's Utility in Acute Ulcerative Colitis Presentations

Opportunities, challenges, and future directions of large language models, including ChatGPT in medical education: a systematic scoping review

Effectiveness of ChatGPT in explaining complex medical reports to patients

The Role of Large Language Models in Medical Education: Applications and Implications

The utility of ChatGPT as a generative medical translator

Digesting Digital Health: A Study of Appropriateness and Readability of ChatGPT-Generated Gastroenterological Information