AI-assisted Automated Short Answer Grading of Handwritten University Level Mathematics Exams

Tianyi Liu, Julia Chatain, Laura Kobel-Keller, Gerd Kortemeyer, Thomas Willwacher, Mrinmaya Sachan
2024-08-21
Abstract: Effective and timely feedback in educational assessments is essential but labor-intensive, especially for complex tasks. Recent developments in automated feedback systems, ranging from deterministic response grading to the evaluation of semi-open and open-ended essays, have been facilitated by advances in machine learning. The emergence of pre-trained Large Language Models, such as GPT-4, offers promising new opportunities for efficiently processing diverse response types with minimal customization. This study evaluates the effectiveness of a pre-trained GPT-4 model in grading semi-open handwritten responses in a university-level mathematics exam. Our findings indicate that GPT-4 provides surprisingly reliable and cost-effective initial grading, subject to subsequent human verification. Future research should focus on refining grading rules and enhancing the extraction of handwritten responses to further leverage these technologies.
History and Overview
What problem does this paper attempt to address?