Reliability of artificial intelligence chatbot responses to frequently asked questions in breast surgical oncology

Estefania Roldan‐Vasquez, Samir Mitri, Shreya Bhasin, Tina Bharani, Kathryn Capasso, Michelle Haslinger, Ranjna Sharma, Ted A. James
DOI: https://doi.org/10.1002/jso.27715
2024-06-06
Journal of Surgical Oncology
Abstract: Introduction: Artificial intelligence (AI)‐driven chatbots, capable of simulating human‐like conversations, are becoming more prevalent in healthcare. While this technology offers potential benefits in patient engagement and information accessibility, it raises concerns about misuse, misinformation, inaccuracies, and ethical challenges. Methods: This study evaluated the responses of a publicly available AI chatbot, ChatGPT, to nine questions about breast cancer surgery selected from the American Society of Breast Surgeons' frequently asked questions (FAQ) patient education website. Four breast surgical oncologists assessed the responses for accuracy and reliability using a five‐point Likert scale and the Patient Education Materials Assessment Tool (PEMAT). Results: The average reliability score for ChatGPT in answering breast cancer surgery questions was 3.98 out of 5.00. The surgeons unanimously found the responses understandable and actionable per the PEMAT criteria. The consensus was that ChatGPT's overall performance was appropriate, with minor or no inaccuracies. Conclusion: ChatGPT demonstrates good reliability in responding to breast cancer surgery queries, with minor, nonharmful inaccuracies. Its answers are accurate, clear, and easy to comprehend. Notably, ChatGPT acknowledged its informational role and did not attempt to replace medical advice or discourage users from seeking input from a healthcare professional.
oncology, surgery