ChatGPT performance on radiation technologist and therapist entry to practice exams

Ryan Duggan, Kaitlyn M Tsuruda
DOI: https://doi.org/10.1016/j.jmir.2024.04.019
2024-05-25
Abstract: Background: The aim of this study was to describe the proficiency of ChatGPT (GPT-4) on certification-style exams from the Canadian Association of Medical Radiation Technologists (CAMRT), and to describe its performance across multiple exam attempts. Methods: ChatGPT was prompted with questions from CAMRT practice exams in the disciplines of radiological technology, magnetic resonance (MRI), nuclear medicine, and radiation therapy (87-98 questions each). ChatGPT attempted each exam five times. Exam performance was evaluated using descriptive statistics, stratified by discipline and question type (knowledge, application, critical thinking). Light's kappa was used to assess agreement in answers across attempts. Results: Using a passing grade of 65 %, ChatGPT passed the radiological technology exam only once (20 %), MRI all five times (100 %), nuclear medicine three times (60 %), and radiation therapy all five times (100 %). ChatGPT's performance was best on knowledge questions across all disciplines except radiation therapy. It performed worst on critical thinking questions. Agreement in ChatGPT's responses across attempts was substantial within the disciplines of radiological technology, MRI, and nuclear medicine, and almost perfect for radiation therapy. Conclusion: ChatGPT (GPT-4) was able to pass certification-style exams for radiation technologists and therapists, but its performance varied between disciplines. The algorithm demonstrated substantial to almost perfect agreement in the responses it provided across multiple exam attempts. Future research evaluating ChatGPT's performance on standardized tests should consider using repeated measures.
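The two summary measures named in the Methods can be illustrated with a minimal sketch. It assumes Light's kappa is computed as the mean of Cohen's kappa over all pairs of attempts, which is the usual definition, and it treats each attempt as a list of selected answer options. The function names (`cohen_kappa`, `lights_kappa`, `pass_rate`) and the answer format are hypothetical and are not taken from the paper, and the authors' actual analysis code may differ.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two attempts answering the same questions."""
    n = len(a)
    # Observed agreement: share of questions with identical answers.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each attempt's marginal answer frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        # Both attempts gave one identical answer throughout (assumed convention).
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def lights_kappa(attempts):
    """Light's kappa: mean Cohen's kappa over all pairs of attempts."""
    pairs = list(combinations(attempts, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def pass_rate(attempts, answer_key, passing_grade=0.65):
    """Fraction of attempts whose score meets the passing grade."""
    scores = [
        sum(x == k for x, k in zip(attempt, answer_key)) / len(answer_key)
        for attempt in attempts
    ]
    passed = sum(score >= passing_grade for score in scores)
    return passed / len(attempts), scores
```

With five attempts per exam, `lights_kappa` averages ten pairwise kappas, and `pass_rate` returns both the share of passing attempts and the per-attempt scores. The default of 0.65 mirrors the 65 % passing grade stated in the Results.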