"My Grade is Wrong!": A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays

Shengxin Hong, Chang Cai, Sixuan Du, Haiyue Feng, Siyuan Liu, Xiuyi Fan

2024-09-12

Abstract: Interactive feedback, where feedback flows in both directions between teacher and student, is more effective than traditional one-way feedback. However, it is often too time-consuming for widespread use in educational practice. While Large Language Models (LLMs) have potential for automating feedback, they struggle with reasoning and interaction in an interactive setting. This paper introduces CAELF, a Contestable AI Empowered LLM Framework for automating interactive feedback. CAELF allows students to query, challenge, and clarify their feedback by integrating a multi-agent system with computational argumentation. Essays are first assessed by multiple Teaching-Assistant Agents (TA Agents), and then a Teacher Agent aggregates the evaluations through formal reasoning to generate feedback and grades. Students can further engage with the feedback to refine their understanding. A case study on 500 critical thinking essays with user studies demonstrates that CAELF significantly improves interactive feedback, enhancing the reasoning and interaction capabilities of LLMs. This approach offers a promising solution to overcoming the time and resource barriers that have limited the adoption of interactive feedback in educational settings.

Artificial Intelligence, Human-Computer Interaction

What problem does this paper attempt to address?

This paper addresses how to provide effective interactive feedback in educational settings despite the time and resource costs that have kept it from widespread use. It proposes CAELF (Contestable AI Empowered LLM Framework), which automates interactive feedback by combining a multi-agent system with computational argumentation. The framework lets students query, challenge, and clarify their feedback, which makes the feedback both more effective and more interactive. The main objectives are:

1. **Improving the effectiveness of interactive feedback**: Interactive feedback between teacher and student is effective but time-consuming and resource-intensive, which limits its use in actual teaching. CAELF automates it to make it efficient enough to be practical.
2. **Enhancing the reasoning and interaction capabilities of large language models (LLMs)**: Existing LLMs reason inaccurately and interact poorly when asked to provide interactive feedback. CAELF improves their performance by applying computational argumentation within a multi-agent system.
3. **Promoting reflective learning among students**: By questioning and discussing their feedback with CAELF, students can gain a deeper understanding of their essays and improve their writing skills.

The paper reports a case study on 500 critical thinking essays, showing that the framework improves both initial scoring accuracy and scoring accuracy during interaction.
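To make the idea of aggregating evaluations through computational argumentation more concrete, here is a minimal sketch. It assumes a Dung-style abstract argumentation framework evaluated under grounded semantics; the summary above does not say which formalism CAELF actually uses, so this is an illustration of the general technique rather than the authors' method. The function name `grounded_extension` and the argument labels (`grade_A`, `grade_B`, `student_rebuttal`) are hypothetical.

```python
from typing import Dict, Set, Tuple


def grounded_extension(arguments: Set[str],
                       attacks: Set[Tuple[str, str]]) -> Set[str]:
    """Compute the grounded extension of an abstract argumentation framework.

    Assumption: this is generic Dung-style grounded semantics, used here only
    as an illustration; it is not taken from the CAELF paper.
    """
    # Map each argument to the set of arguments that attack it.
    attackers: Dict[str, Set[str]] = {a: set() for a in arguments}
    for src, dst in attacks:
        attackers[dst].add(src)

    accepted: Set[str] = set()
    while True:
        # Arguments attacked by something that is currently accepted.
        defeated = {dst for src, dst in attacks if src in accepted}
        # An argument is acceptable if all of its attackers are defeated.
        acceptable = {a for a in arguments if attackers[a] <= defeated}
        if acceptable == accepted:
            return accepted
        accepted = acceptable


# Hypothetical scenario: two TA agents propose conflicting grades.
args = {"grade_A", "grade_B"}
atts = {("grade_A", "grade_B"), ("grade_B", "grade_A")}
before = grounded_extension(args, atts)

# The student contests the argument behind grade B.
args_after = args | {"student_rebuttal"}
atts_after = atts | {("student_rebuttal", "grade_B")}
after = grounded_extension(args_after, atts_after)

print(sorted(before))
print(sorted(after))
```

In this toy scenario the two grade arguments attack each other, so neither is accepted until the student's rebuttal defeats one of them. That mirrors, in a simplified way, how a student's challenge could change the outcome of the aggregation.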

"My Grade is Wrong!": A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays
