Abstract: This study investigates the use of generative AI and multi-agent systems to provide automatic feedback in educational contexts, particularly for student constructed responses in science assessments. The research addresses a key gap in the field by exploring how a multi-agent system, called AutoFeedback, can improve the quality of GenAI-generated feedback, overcoming known issues such as over-praise and over-inference that are common in single-agent large language models (LLMs). The study developed a multi-agent system consisting of two AI agents: one for generating feedback and another for validating and refining it. The system was tested on a dataset of 240 student responses, and its performance was compared to that of a single-agent LLM. Results showed that AutoFeedback significantly reduced the occurrence of over-praise and over-inference errors, providing more accurate and pedagogically sound feedback. The findings suggest that multi-agent systems can offer a more reliable solution for generating automated feedback in educational settings, highlighting their potential for scalable and personalized learning support. These results have important implications for educators and researchers seeking to leverage AI in formative assessments, offering a pathway to more effective feedback mechanisms that enhance student learning outcomes.
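
The two-agent design described in the abstract can be sketched as a generate-then-validate pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `LLM`, `generate_feedback`, `validate_feedback`, and `auto_feedback`, as well as the prompt wording, are hypothetical, and the model call is abstracted as any function that maps a prompt string to a completion string.

```python
from typing import Callable

# Hypothetical abstraction: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def generate_feedback(llm: LLM, task: str, response: str) -> str:
    """Agent 1 (assumed prompt): draft feedback on a student's response."""
    prompt = (
        f"Task: {task}\n"
        f"Student response: {response}\n"
        "Write brief formative feedback for the student."
    )
    return llm(prompt)


def validate_feedback(llm: LLM, task: str, response: str, draft: str) -> str:
    """Agent 2 (assumed prompt): check the draft and return a revised version."""
    prompt = (
        f"Task: {task}\n"
        f"Student response: {response}\n"
        f"Draft feedback: {draft}\n"
        "Check the draft for over-praise (praise the response does not earn) "
        "and over-inference (claims the response does not support). "
        "Return a corrected version of the feedback."
    )
    return llm(prompt)


def auto_feedback(llm: LLM, task: str, response: str) -> str:
    """Run Agent 1, then pass its draft to Agent 2 for validation."""
    draft = generate_feedback(llm, task, response)
    return validate_feedback(llm, task, response, draft)
```

Calling `generate_feedback` alone corresponds to the single-agent baseline, while `auto_feedback` corresponds to the two-agent variant, so both conditions can share the same first step.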

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper examines how Generative AI (GenAI) and Multi-Agent Systems (MAS) can provide automatic feedback in educational settings. Specifically, the research focuses on two questions:

1. **How frequent are over-praise and over-inference in the feedback generated by a single Generative AI agent?**
   - Over-praise refers to feedback that is too positive, even when the student's answer is incorrect or meaningless.
   - Over-inference refers to conclusions in the feedback that do not match the student's actual performance, creating a mismatch between the feedback and the student's work.
2. **How much does the Multi-Agent System (AutoFeedback) reduce over-praise and over-inference?**

### Research background

Automatic feedback plays an important role in supporting personalized learning, especially in online learning environments and classroom formative assessments. However, automatic feedback generated by Generative AI suffers from problems such as over-praise and over-inference, which may mislead students, reduce learning motivation, and harm learning outcomes. To overcome these limitations, the researchers developed a Multi-Agent System (AutoFeedback) consisting of two AI agents: one that generates feedback and another that verifies and refines it.

### Methods

1. **Dataset**:
   - The dataset comes from an existing study in which middle school students constructed and described models of a scientific phenomenon, to assess whether they met the NGSS performance expectation MS-PS1-4. The task involves observing how chocolate candies dissolve in water at different temperatures and explaining the phenomenon.
   - Responses from 845 students were collected and classified as "beginners" or "proficients" according to the scoring criteria. To test automatic feedback generation, the responses of 120 "beginners" and 120 "proficients" were randomly selected to form a balanced dataset.
2. **Experimental setup**:
   - Single-agent system: Agent 1 generates the initial feedback, simulating the feedback generation process of a traditional LLM system.
   - Multi-agent system (AutoFeedback): After Agent 1 generates the initial feedback, Agent 2 verifies and refines it, identifying and correcting instances of over-praise and over-inference.

### Results

- The Multi-Agent System (AutoFeedback) significantly reduces the incidence of over-praise and over-inference, providing more accurate and more educationally valuable feedback.
- Compared with the single-agent system, the Multi-Agent System generates higher-quality automatic feedback.

### Significance

- The research shows that a Multi-Agent System can provide a more reliable solution for generating automated feedback in educational settings, which supports large-scale and personalized learning.
- These results matter to educators and researchers because they offer a new way to use AI for more effective feedback in formative assessments, thereby improving students' learning outcomes.

### Keywords

- Generative Artificial Intelligence
- Multi-agents
- Automatic Feedback
- Large Language Model (LLM)

### Summary

This paper addresses the problems of over-praise and over-inference in automatic feedback generated by Generative AI by developing and testing the Multi-Agent System (AutoFeedback), providing a new approach to automatic feedback generation in education.
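
The balanced sampling step and the comparison of error frequencies described above can be sketched as follows. This is an illustrative sketch only: the function names `balanced_sample` and `error_rates`, the `(label, text)` data layout, and the tag names `over_praise` and `over_inference` are assumptions, and the toy annotations are made up rather than taken from the paper.

```python
import random
from collections import Counter


def balanced_sample(responses, per_label=120, seed=0):
    """Randomly draw the same number of responses from each proficiency label.

    `responses` is a list of (label, text) pairs (assumed layout).
    Raises ValueError if a label has fewer than `per_label` responses.
    """
    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    by_label = {}
    for label, text in responses:
        by_label.setdefault(label, []).append((label, text))
    sample = []
    for label in sorted(by_label):
        sample.extend(rng.sample(by_label[label], per_label))
    return sample


def error_rates(annotations):
    """Share of feedback messages tagged with each error type.

    `annotations` holds one set of error tags per feedback message.
    """
    n = len(annotations)
    if n == 0:
        raise ValueError("no annotations to summarize")
    counts = Counter(tag for tags in annotations for tag in tags)
    return {tag: counts[tag] / n for tag in ("over_praise", "over_inference")}


# Toy annotations, made up for illustration (not the paper's results).
single_agent = [{"over_praise"}, {"over_praise", "over_inference"}, set(), set()]
print(error_rates(single_agent))  # {'over_praise': 0.5, 'over_inference': 0.25}
```

Computing `error_rates` once for the single-agent feedback and once for the AutoFeedback output, over the same sampled responses, gives the two frequencies that the research questions compare.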

Using Generative AI and Multi-Agents to Provide Automatic Feedback
