Use and Application of Large Language Models for Patient Questions following Total Knee Arthroplasty
Sandeep S. Bains, Jeremy A. Dubin, Daniel Hameed, Oliver C. Sax, Scott Douglas, Michael Mont, James Nace, Ronald E. Delanois
DOI: https://doi.org/10.1016/j.arth.2024.03.017
IF: 4.435
2024-03-15
The Journal of Arthroplasty
Abstract: Introduction: A consumer-focused health care model not only allows unprecedented access to information, but equally warrants consideration of the appropriateness of providing accurate patient health information. Nurses play a large role in influencing patient satisfaction following total knee arthroplasty (TKA), but they come at a cost. A specific natural language artificial intelligence (AI) model, ChatGPT (Generative Pretrained Transformer), accumulated over 100 million users within months of launching. As such, we aimed to compare: 1) orthopaedic surgeons' evaluation of the appropriateness of the answers to the most frequently asked patient questions after TKA; and 2) patients' comfort level with answers to their postoperative questions provided by arthroplasty-trained nurses and by ChatGPT.
Methods: We prospectively created 60 questions based on the most commonly asked patient questions following TKA. Three fellowship-trained surgeons assessed the answers provided by arthroplasty-trained nurses and ChatGPT-4 to each of the questions. The surgeons graded each set of responses based on clinical judgment as: 1) "appropriate," 2) "inappropriate," if the response contained inappropriate information, or 3) "unreliable," if the responses provided inconsistent content. Patients' comfort level and trust in AI were assessed using Research Electronic Data Capture (REDCap) hosted at our local hospital.
Results: The surgeons graded 44 out of 60 (73.3%) responses from the arthroplasty-trained nurses and 44 out of 60 (73.3%) from ChatGPT to be "appropriate." Of the responses provided by the nurses, four were graded "inappropriate" and one was graded "unreliable." Of the ChatGPT responses, five were graded "inappropriate" and none were graded "unreliable." There were 136 patients (53.8%) who were more comfortable with the answers provided by ChatGPT compared to 86 patients (34.0%) who preferred the answers from arthroplasty-trained nurses. Of the 253 patients, 233 (92.1%) were uncertain if they would trust AI to answer their postoperative questions. There were 127 patients (50.2%) who answered that if they knew the previous answer was provided by ChatGPT, their comfort level in trusting the answer would change.
Conclusion: One potential use of ChatGPT is providing appropriate answers to patient questions after TKA. At our institution, cost expenditures can potentially be minimized while maintaining patient satisfaction. Ultimately, successful implementation depends on the ability to provide information that is credible and in accordance with the objectives of both physicians and patients.
orthopedics