Assessing ChatGPT's Responses to Otolaryngology Patient Questions

Jonathan M. Carnino,William R. Pellegrini,Megan Willis,Michael B. Cohen,Marianella Paz-Lansberg,Elizabeth M. Davis,Gregory A. Grillone,Jessica R. Levi
DOI: https://doi.org/10.1177/00034894241249621
2024-04-28
Annals of Otology Rhinology & Laryngology
Abstract:Annals of Otology, Rhinology &Laryngology, Ahead of Print. Objective:This study aims to evaluate ChatGPT's performance in addressing real-world otolaryngology patient questions, focusing on accuracy, comprehensiveness, and patient safety, to assess its suitability for integration into healthcare.Methods:A cross-sectional study was conducted using patient questions from the public online forum Reddit's r/AskDocs, where medical advice is sought from healthcare professionals. Patient questions were input into ChatGPT (GPT-3.5), and responses were reviewed by 5 board-certified otolaryngologists. The evaluation criteria included difficulty, accuracy, comprehensiveness, and bedside manner/empathy. Statistical analysis explored the relationship between patient question characteristics and ChatGPT response scores. Potentially dangerous responses were also identified.Results:Patient questions averaged 224.93 words, while ChatGPT responses were longer at 414.93 words. The accuracy scores for ChatGPT responses were 3.76/5, comprehensiveness scores were 3.59/5, and bedside manner/empathy scores were 4.28/5. Longer patient questions did not correlate with higher response ratings. However, longer ChatGPT responses scored higher in bedside manner/empathy. Higher question difficulty correlated with lower comprehensiveness. Five responses were flagged as potentially dangerous.Conclusion:While ChatGPT exhibits promise in addressing otolaryngology patient questions, this study demonstrates its limitations, particularly in accuracy and comprehensiveness. The identification of potentially dangerous responses underscores the need for a cautious approach to AI in medical advice. Responsible integration of AI into healthcare necessitates thorough assessments of model performance and ethical considerations for patient safety.
otorhinolaryngology
What problem does this paper attempt to address?