Abstract e13628

Background: ChatGPT is a conversational artificial intelligence (AI) model that learns from massive text-based datasets and then responds to user input, which often involves completing tasks or answering questions. Recent studies showed ChatGPT's success in passing multiple specialty medical licensing and board examinations, showcasing its promising capabilities in the medical domain. Here, we investigated ChatGPT's potential as a swift and reliable information source for medical oncologists using board examination-style questions and real patient cases.

Methods: We randomly selected 121 board-style questions from the American Society of Clinical Oncology Self-Evaluation Program (ASCO SEP). The questions were entered into ChatGPT as both multiple-choice (MC) and open-ended (OE) prompts. ChatGPT's answers and explanations were evaluated for accuracy and concordance. Non-inferiority analysis was performed with 80% power at α = 0.05 and the non-inferiority margin set at 70% correct answers, given the historical board exam pass rate of about 65% correct answers. For subgroup analysis, the questions were categorized by tested competency and primary tumor pathology. ChatGPT was also given 10 questions derived from real patient cases. We compared its responses to the answers provided by experienced oncologists to determine accuracy and practical applicability.

Results: ChatGPT answered 75 (62.0%) MC queries correctly. Among the correctly answered queries, 2 responses contained faulty explanations. Such inaccurate or discordant explanations were found in 26 of the 46 incorrectly answered queries. With OE prompts, ChatGPT answered 53 (43.8%) questions correctly, with correct explanations for all. Of the 68 incorrect responses, 32 contained inaccurate or discordant explanations. Subgroup analysis suggested varying performance across the categories. The best performance was seen with malignant hematology (81.8% of MC and 72.8% of OE prompts answered correctly), while the weakest performance was seen with genitourinary malignancies (60% of MC and 20% of OE prompts answered correctly). As for the real-world patient case questions, responses from ChatGPT and the clinicians were concordant for 5 of the 10 questions. None of the discordant responses contained inaccurate information, while 80% of the concordant responses contained sufficient details to assist with patient management decisions.

Conclusions: ChatGPT's performance fell short of the non-inferiority margin, highlighting the challenges of incorporating AI in the rapidly evolving field of medical oncology. Despite the limitations, ChatGPT's partial success in both board-style and real-world patient care questions affirms its potential for clinical utility in the future.
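The comparison against the 70% margin can be illustrated with a short calculation. The abstract does not say which statistical test was used, so the sketch below assumes an exact one-sided binomial test of the observed number of correct answers against the margin. The function names `binom_upper_tail` and `meets_margin` are hypothetical and chosen only for this illustration. The counts (75 and 53 of 121) are taken from the Results.

```python
from math import comb


def binom_upper_tail(successes: int, n: int, p0: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, p0)."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


def meets_margin(successes: int, n: int, margin: float = 0.70, alpha: float = 0.05):
    """One-sided exact test of H0: p <= margin against H1: p > margin.

    Assumption: the abstract does not specify its test; an exact binomial
    test is used here purely for illustration.
    """
    p_value = binom_upper_tail(successes, n, margin)
    return p_value < alpha, p_value


N_QUESTIONS = 121  # board-style questions reported in the abstract

for label, correct in (("MC", 75), ("OE", 53)):
    cleared, p_value = meets_margin(correct, N_QUESTIONS)
    print(
        f"{label}: {correct}/{N_QUESTIONS} = {correct / N_QUESTIONS:.1%}, "
        f"one-sided p = {p_value:.3f}, clears 70% margin: {cleared}"
    )
```

Under this assumed test, neither prompt format clears the margin, because both observed proportions lie below 70%. That agrees with the conclusion that performance fell short of the non-inferiority margin.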

ChatGPT Solving Complex Kidney Transplant Cases: A Comparative Study With Human Respondents

ChatGPT and Artificial Intelligence in Transplantation Research: Is It Always Correct?

Evaluating ChatGPT's Accuracy in Responding to Patient Education Questions on Acute Kidney Injury and Continuous Renal Replacement Therapy

Exploring the ability of ChatGPT to create quality patient education resources about kidney transplant

ChatGPT v4 outperforming v3.5 on cancer treatment recommendations in quality, clinical guideline, and expert opinion concordance

Availability of ChatGPT to provide medical information for patients with kidney cancer

The potential of ChatGPT in medicine: an example analysis of nephrology specialty exams in Poland

The Accuracy of Artificial Intelligence ChatGPT in Oncology Examination Questions

Using ChatGPT for Kidney Transplantation: Perceived Information Quality by Race and Education Levels

Assessing ChatGPT's potential as a clinical resource for medical oncologists: An evaluation with board-style questions and real-world patient cases.

Digital health tools in nephrology: A comparative analysis of AI and professional opinions via online polls

Evaluating the Performance of ChatGPT in Urology: A Comparative Study of Knowledge Interpretation and Patient Guidance

Performance of ChatGPT on American Board of Surgery In-Training Examination Preparation Questions

AI-driven translations for kidney transplant equity in Hispanic populations

Performance of GPT-4 Vision on kidney pathology exam questions

Assessing ChatGPT's Responses to Otolaryngology Patient Questions

Evaluating Performance of ChatGPT on MKSAP Cardiology Board Review Questions

Performance of ChatGPT on Solving Orthopedic Board-Style Questions: A Comparative Analysis of ChatGPT 3.5 and ChatGPT 4

Evaluation of ChatGPT pathology knowledge using board-style questions

Evaluating the performance of ChatGPT in clinical pharmacy: A comparative study of ChatGPT and clinical pharmacists

Performance of ChatGPT-3.5 and ChatGPT-4 on the European Board of Urology (EBU) exams: a comparative analysis