Can ChatGPT pass the thoracic surgery exam?

Adem Gencer, Suphi Aydin
DOI: https://doi.org/10.1016/j.amjms.2023.08.001
2023-08-08
Abstract: Background: The capabilities of ChatGPT in academic settings and medical exams are increasingly being explored. In this study, we tested the performance of ChatGPT on Turkish-language thoracic surgery exam questions. Methods: ChatGPT was given a total of 105 questions divided into seven distinct groups of 15 questions each. The success of the ChatGPT-3.5 and ChatGPT-4 architectures in answering the questions correctly was analyzed alongside the success of the students. Results: The students' overall mean score was 12.50 ± 1.20 out of 15, corresponding to 83.33%. ChatGPT-3.5 surpassed the students' mean, answering 13.57 ± 0.49 questions correctly on average, while ChatGPT-4 answered 14 ± 0.76 questions correctly (students, ChatGPT-3.5, and ChatGPT-4: 83.33%, 90.48%, and 93.33%, respectively). Conclusions: When the results of this study are evaluated together with similar studies in the literature, ChatGPT, although developed as a general-purpose tool, can also produce successful results in a specific field of medicine. AI-powered applications are becoming increasingly useful and valuable in providing academic knowledge.
medicine, general & internal