How critically can an AI think? A framework for evaluating the quality of thinking of generative artificial intelligence

Luke Zaphir, Jason M. Lodge, Jacinta Lisec, Dom McGrath, Hassan Khosravi
2024-06-21
Abstract: Generative AI, such as systems built on large language models, has created opportunities for innovative assessment design practices. Due to recent technological developments, there is a need to know the limits and capabilities of generative AI in terms of simulating cognitive skills. Assessing student critical thinking skills has been a feature of assessment since time immemorial, but the demands of digital assessment create unique challenges for equity, academic integrity and assessment authorship. Educators need a framework for determining their assessments' vulnerability to generative AI to inform assessment design practices. This paper presents a framework that explores the capabilities of the LLM ChatGPT4 application, which is the current industry benchmark. This paper presents the Mapping of questions, AI vulnerability testing, Grading, Evaluation (MAGE) framework, which educators can use to methodically critique their assessments within their own disciplinary contexts. This critique will provide specific and targeted indications of their questions' vulnerabilities in terms of critical thinking skills. This can go on to form the basis of assessment design for their tasks.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the problem of how to evaluate the capability of generative artificial intelligence (such as large language models) to simulate cognitive skills, particularly critical thinking. The development of generative AI technology poses new challenges to the design, fairness, and academic integrity of educational assessments. Educators need a framework to determine how vulnerable their assessment tasks are to generative AI, in order to guide assessment design practices. The paper proposes the MAGE framework (Mapping of questions, AI vulnerability testing, Grading, Evaluation), which lets educators systematically critique their assessment tasks and gives specific, targeted indications of how vulnerable each question is in terms of critical thinking skills. These indications can serve as a foundation for assessment design, helping educators better understand the capabilities of generative AI and design tasks that probe students' understanding and analytical abilities rather than mere information recall, which is susceptible to AI assistance.