ChatGPT 3.5 fails to write appropriate multiple choice practice exam questions

Alexander Ngo, Saumya Gupta, Oliver Perrine, Rithik Reddy, Sherry Ershadi, Daniel Remick
DOI: https://doi.org/10.1016/j.acpath.2023.100099
2024-01-01
Academic Pathology
Abstract: Artificial intelligence (AI) may have a profound impact on traditional teaching in academic settings. Multiple concerns have been raised, especially related to using ChatGPT for creating de novo essays. However, AI programs such as ChatGPT may augment teaching techniques. In this article, we used ChatGPT 3.5 to create 60 multiple-choice questions. Author-written text was uploaded, and ChatGPT was asked to create multiple-choice questions with an explanation for the correct answer and explanations for the incorrect answers. Unfortunately, ChatGPT generated correct questions and answers with explanations for only 32% of the questions (19 out of 60). In many instances, ChatGPT failed to provide an explanation for the incorrect answers. An additional 25% of the questions had answers that were either wrong or misleading. A grade of 32% would be considered failing in most courses. Despite these issues, instructors may still find ChatGPT useful for creating practice exams with explanations, with the caveat that extensive editing may be required.