Abstract:
Background: Large language models (LLMs) are computational artificial intelligence systems with advanced natural language processing capabilities that have recently been popularized among health care students and educators due to their ability to provide real-time access to a vast amount of medical knowledge. The adoption of LLM technology into medical education and training has varied, and little empirical evidence exists to support its use in clinical teaching environments.
Objective: The aim of the study is to identify and qualitatively evaluate potential use cases and limitations of LLM technology for real-time ward-based educational contexts.
Methods: A brief, single-site exploratory evaluation of the publicly available ChatGPT-3.5 (OpenAI) was conducted by implementing the tool into the daily attending rounds of a general internal medicine inpatient service at a large urban academic medical center. ChatGPT was integrated into rounds via both structured and organic use, using the web-based "chatbot" style interface to interact with the LLM through conversational free-text and discrete queries. A qualitative approach using phenomenological inquiry was used to identify key insights related to the use of ChatGPT through analysis of ChatGPT conversation logs and associated shorthand notes from the clinical sessions.
Results: Identified use cases for ChatGPT integration included addressing medical knowledge gaps through discrete medical knowledge inquiries, building differential diagnoses and engaging dual-process thinking, challenging medical axioms, using cognitive aids to support acute care decision-making, and improving complex care management by facilitating conversations with subspecialties. Potential additional uses included engaging in difficult conversations with patients, exploring ethical challenges and general medical ethics teaching, personal continuing medical education resources, developing ward-based teaching tools, supporting and automating clinical documentation, and supporting productivity and task management. LLM biases, misinformation, ethics, and health equity were identified as areas of concern and potential limitations to clinical and training use. A code of conduct on ethical and appropriate use was also developed to guide team usage on the wards.
Conclusions: Overall, ChatGPT offers a novel tool to enhance ward-based learning through rapid information querying, second-order content exploration, and engaged team discussion regarding generated responses. More research is needed to fully understand contexts for educational use, particularly regarding the risks and limitations of the tool in clinical settings and its impacts on trainee development.

ChatGPT-o1 and the Pitfalls of Familiar Reasoning in Medical Ethics

ChatGPT as a Tool for Medical Education and Clinical Decision-Making on the Wards: Case Study

OpenAI o1-Preview vs. ChatGPT in Healthcare: A New Frontier in Medical AI Reasoning

The ethics of ChatGPT in medicine and healthcare: a systematic review on Large Language Models (LLMs)

What Should ChatGPT Mean for Bioethics?

ChatGPT and Beyond: An overview of the growing field of large language models and their use in ophthalmology

Evaluating the Feasibility of ChatGPT in Healthcare: An Analysis of Multiple Clinical and Research Scenarios

Using ChatGPT in the Development of Clinical Reasoning Cases: A Qualitative Study

Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification

Large language models in medical ethics: useful but not expert

The Role of Large Language Models in Medical Education: Applications and Implications

Large Language Models Like ChatGPT Show Promise, but Clinical Use of Artificial Intelligence Requires Physician Partnership to Enable Patient Care, Minimize Administrative Burden, Maximize Efficiency, and Minimize Risk

Advancing Medical Practice with Artificial Intelligence: ChatGPT in Healthcare

Invited Commentary on ChatGPT: What Every Pediatric Surgeon Should Know About Its Potential Uses and Pitfalls

AI as a Medical Ally: Evaluating ChatGPT's Usage and Impact in Indian Healthcare

Exploring the potential utility of AI large language models for medical ethics: an expert panel evaluation of GPT-4

The Intersection of ChatGPT, Clinical Medicine, and Medical Education

Potential applications and implications of large language models in primary care

Assessing the performance of ChatGPT in bioethics: a large language model's moral compass in medicine
