ChatGPT Solving Complex Kidney Transplant Cases: A Comparative Study With Human Respondents

Michal A. Mankowski, Ian S. Jaffe, Jingzhi Xu, Sunjae Bae, Eric K. Oermann, Yindalon Aphinyanaphongs, Mara A. McAdams‐DeMarco, Bonnie E. Lonze, Babak J. Orandi, Darren Stewart, Macey Levan, Allan Massie, Sommer Gentry, Dorry L. Segev
DOI: https://doi.org/10.1111/ctr.15466
2024-09-29
Clinical Transplantation
Abstract:
Introduction: ChatGPT has shown the ability to answer clinical questions in general medicine but may be constrained by the specialized nature of kidney transplantation. Thus, it is important to explore how ChatGPT can be used in kidney transplantation and how its knowledge compares to human respondents.
Methods: We prompted ChatGPT versions 3.5, 4, and 4 Visual (4 V) with 12 multiple‐choice questions related to six kidney transplant cases from 2013 to 2015 American Society of Nephrology (ASN) fellowship program quizzes. We compared the performance of ChatGPT with US nephrology fellowship program directors, nephrology fellows, and the audience of the ASN's annual Kidney Week meeting.
Results: Overall, ChatGPT 4 V correctly answered 10 out of 12 questions, showing a performance level comparable to nephrology fellows (group majority correctly answered 9 of 12 questions) and training program directors (11 of 12). This surpassed ChatGPT 4 (7 of 12 correct) and 3.5 (5 of 12). All three ChatGPT versions failed to correctly answer questions where the consensus among human respondents was low.
Conclusion: Each iterative version of ChatGPT performed better than the prior version, with version 4 V achieving performance on par with nephrology fellows and training program directors. While it shows promise in understanding and answering kidney transplantation questions, ChatGPT should be seen as a complementary tool to human expertise rather than a replacement.
surgery, transplantation