Do ChatGPT and Google Differ in Answers to Commonly Asked Patient Questions Regarding Total Shoulder and Total Elbow Arthroplasty?
Shebin Tharakan,Brandon Klein,Luke Bartlett,Aaron Atlas,Stephen A. Parada,Randy M. Cohn
DOI: https://doi.org/10.1016/j.jse.2023.11.014
IF: 3.507
2024-01-05
Journal of Shoulder and Elbow Surgery
Abstract: Background: Artificial intelligence (AI) and large language models (LLMs) offer a potential new resource for patient education. The answers given by ChatGPT, an LLM AI text bot, to frequently asked questions (FAQs) were compared with answers provided by a contemporary Google search to determine the reliability of the information these sources provide for patient education in upper extremity arthroplasty. Methods: "Total shoulder arthroplasty" (TSA) and "total elbow arthroplasty" (TEA) were entered into Google Search and ChatGPT 3.0 to determine the ten most frequently asked questions for each procedure. On Google, the FAQs were obtained through the "people also ask" section, while ChatGPT was asked to provide the ten most frequently asked questions. Each question, answer, and cited reference(s) were recorded. A modified version of the Rothwell system was used to categorize questions into 10 subtopics: special activities, timeline of recovery, restrictions, technical details, cost, indications/management, risks and complications, pain, longevity, and evaluation of surgery. Each reference was categorized into one of the following groups: commercial, academic, medical practice, single surgeon personal, or social media. Questions for TSA and TEA were combined for analysis and compared between Google and ChatGPT with a two-sample Z-test for proportions. Results: Overall, the most common question category was procedural indications or management (17.5%). There were no significant differences between Google and ChatGPT in question categories. The majority of references were from academic websites (65%). ChatGPT produced a greater proportion of academic references than Google (80% vs 50%; p = 0.047), while Google more commonly provided medical practice references (25% vs 0%; p = 0.017). Conclusion: In conjunction with patient-physician discussions, AI LLMs may provide a reliable resource for patients. By providing information based on academic references, these tools have the potential to improve health literacy and shared decision making for patients searching for information about TSA and TEA. Clinical Significance: With the rising prevalence of AI programs, it is essential to understand how these applications affect patient education in medicine.
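The two-sample Z-test for proportions named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it uses the standard pooled-variance form of the test, and the counts in the usage lines assume 20 references per source (ten questions for each of the two procedures, one reference per question), which the abstract does not state. The function name `two_proportion_z_test` is hypothetical.

```python
from math import sqrt, erfc


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample Z-test for proportions (two-sided).

    x1, x2: number of "successes" in each group
    n1, n2: group sizes
    Returns (z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis p1 == p2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Assumed counts (not given in the abstract): 20 references per source.
# Academic references: 80% vs 50% -> 16/20 vs 10/20
z_acad, p_acad = two_proportion_z_test(16, 20, 10, 20)
# Medical practice references: 25% vs 0% -> 5/20 vs 0/20 (Google vs ChatGPT)
z_prac, p_prac = two_proportion_z_test(5, 20, 0, 20)

print(f"academic:         z = {z_acad:.2f}, p = {p_acad:.3f}")
print(f"medical practice: z = {z_prac:.2f}, p = {p_prac:.3f}")
```

Under these assumed counts, the computed p-values fall close to the reported 0.047 and 0.017, which suggests the assumption of 20 references per source is at least consistent with the abstract.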
surgery,orthopedics,sport sciences