A Benchmark for Math Misconceptions: Bridging Gaps in Middle School Algebra with AI-Supported Instruction

Otero Nancy, Druga Stefania, Lan Andrew
2024-12-05
Abstract: This study introduces an evaluation benchmark for middle school algebra to be used in artificial intelligence (AI)-based educational platforms. The goal is to support the design of AI systems that can enhance learners' conceptual understanding of algebra by taking into account their current level of algebra comprehension. The dataset comprises 55 misconceptions about algebra, common errors, and 220 diagnostic examples identified in previous peer-reviewed studies. We provide an example application using a large language model (LLM), observing a range of precision and recall scores depending on the topic and experimental setup, reaching 83.9% when including educator feedback and restricting by topic. We found that topics such as ratios and proportions prove as difficult for LLMs as they are for students. We included a human assessment of LLM results and feedback from five middle school math educators on the clarity and occurrence of misconceptions in the dataset and the potential use of AI in conjunction with the dataset. Most educators (80% or more) indicated that they encounter these misconceptions among their students, suggesting the relevance of the dataset to teaching middle school algebra. Despite varying familiarity with AI tools, four out of five educators expressed interest in using the dataset with AI to diagnose student misconceptions or train teachers. The results emphasize the importance of topic-constrained testing, the need for multimodal approaches, and the relevance of human expertise to gain practical insights when using AI for human learning.
Human-Computer Interaction, Computers and Society
### What problem does this paper attempt to solve?

This paper addresses the problem of identifying and diagnosing common student misconceptions in middle school algebra. Specifically, the authors create a benchmark dataset and combine it with artificial intelligence (AI) technology to help teachers better understand and address students' algebraic misconceptions, thereby improving teaching effectiveness. The main objectives of the paper are:

1. **Create a comprehensive algebraic misconception benchmark**:
   - The paper constructs a dataset containing 55 common algebraic misconceptions and 220 diagnostic examples. These misconceptions and examples are extracted from 145 peer-reviewed studies, covering algebra content from grade 4 to grade 8.
2. **Evaluate the accuracy of AI systems in identifying misconceptions**:
   - The authors use large language models (LLMs) such as GPT-4 and evaluate how well they identify these misconceptions. GPT-4 struggles with some of the same topics that students do, notably ratios and proportions. Precision and recall vary with the topic and experimental setup, reaching 83.9% when educator feedback is included and the evaluation is restricted by topic.
3. **Obtain teacher feedback**:
   - The researchers invited five middle school mathematics teachers to evaluate the misconceptions in the dataset, to check that they occur in actual teaching and are clearly described. Most teachers (80% or more) said that they encounter these misconceptions among their students, and four of the five expressed interest in using the dataset with AI to diagnose student misconceptions or train teachers.

### Main research questions

The core research question of the paper is: **How can AI solutions be used to help teachers identify students' misconceptions?** To answer this question, the authors propose a three-part solution:

1. Create a comprehensive benchmark covering algebraic misconceptions from grade 4 to grade 8.
2. Evaluate the accuracy of large language models (LLMs) in diagnosing these 55 misconceptions.
3. Obtain feedback from algebra teachers on the potential uses of the benchmark and its evaluation results.

### Research background

- **Educational inequality issues**: Low-income and ethnic minority students face insufficient resources in math learning, leading to a widening gap in math performance.
- **Limitations of existing technologies**: Although technologies such as computer-aided instruction, adaptive tutoring systems, and massive open online courses (MOOCs) are widely used in math education, they often exacerbate existing socioeconomic differences.
- **Importance of misconceptions**: Research shows that identifying and correcting students' math misconceptions can significantly improve their academic performance and participation.

### Conclusion

By assembling a benchmark of misconceptions and diagnostic examples drawn from peer-reviewed studies and combining it with AI technology, this study provides a valuable resource for researchers, technology developers, and educators. The benchmark can help improve math learning outcomes and shows the potential of large language models in supporting math education.
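
### Illustrative sketch: scoring diagnoses by topic

The evaluation step summarized above reports precision and recall that vary by topic. The following is a minimal sketch of how such per-topic scores could be computed, not the authors' method. It assumes each diagnostic example has one reference misconception and that the model may return zero or more candidate misconceptions. The function name, the topic names, the `M01`-style identifiers, and the sample records are all hypothetical.

```python
from collections import defaultdict


def precision_recall_by_topic(records):
    """Compute (precision, recall) per topic.

    records: iterable of (topic, true_id, predicted_ids), where
    predicted_ids lists the misconception IDs the model proposed.
    Assumed definitions (not taken from the paper):
      precision = correct predictions / all predictions made
      recall    = examples whose true ID was predicted / all examples
    """
    stats = defaultdict(lambda: {"hits": 0, "predicted": 0, "examples": 0})
    for topic, true_id, predicted_ids in records:
        unique_predictions = set(predicted_ids)
        s = stats[topic]
        s["examples"] += 1
        s["predicted"] += len(unique_predictions)
        s["hits"] += int(true_id in unique_predictions)

    result = {}
    for topic, s in stats.items():
        # Guard against topics where the model made no predictions at all.
        precision = s["hits"] / s["predicted"] if s["predicted"] else 0.0
        recall = s["hits"] / s["examples"]
        result[topic] = (precision, recall)
    return result


# Hypothetical records: (topic, reference misconception, model predictions).
records = [
    ("ratios", "M01", ["M01", "M07"]),
    ("ratios", "M02", []),
    ("variables", "M10", ["M10"]),
    ("variables", "M11", ["M11"]),
]
scores = precision_recall_by_topic(records)
for topic, (precision, recall) in sorted(scores.items()):
    print(f"{topic}: precision={precision:.2f} recall={recall:.2f}")
```

Grouping by topic before scoring mirrors the paper's emphasis on topic-constrained testing: a single aggregate score would hide topics, such as ratios and proportions, where the model does poorly.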