Adversarial Math Word Problem Generation

Roy Xie, Chengxuan Huang, Junlin Wang, Bhuwan Dhingra
2024-06-16
Abstract: Large language models (LLMs) have significantly transformed the educational landscape. As current plagiarism detection tools struggle to keep pace with LLMs' rapid advancements, the educational community faces the challenge of assessing students' true problem-solving abilities in the presence of LLMs. In this work, we explore a new paradigm for ensuring fair evaluation -- generating adversarial examples which preserve the structure and difficulty of the original questions aimed for assessment, but are unsolvable by LLMs. Focusing on the domain of math word problems, we leverage abstract syntax trees to structurally generate adversarial examples that cause LLMs to produce incorrect answers by simply editing the numeric values in the problems. We conduct experiments on various open- and closed-source LLMs, quantitatively and qualitatively demonstrating that our method significantly degrades their math problem-solving ability. We identify shared vulnerabilities among LLMs and propose a cost-effective approach to attack high-cost models. Additionally, we conduct automatic analysis to investigate the cause of failure, providing further insights into the limitations of LLMs.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the problem of fair educational assessment in the presence of large language models (LLMs), focusing on Math Word Problems (MWPs). As LLMs have become more capable at natural language generation and problem solving, students can use them to complete assignments, which makes it hard for educators to assess students' actual problem-solving abilities. To tackle this, the paper proposes a new paradigm for fair assessment: generating adversarial examples. Specifically, it generates adversarial MWPs by modifying only the numeric values in the problems. These examples retain the original problem's structure and difficulty but are designed to cause LLMs to produce incorrect answers. The main contributions of the paper include:

1. **Proposed Method**: Transforming MWPs into Python code and then using abstract syntax trees (ASTs) to make structured modifications, so that adversarial examples are generated in a controlled manner.
2. **Educational Constraints**: Defining a set of constraints to ensure that the generated problems remain logical and educationally valuable, such as preserving the sign of each number, keeping integers as integers, and keeping fractions within an appropriate range.
3. **Generation Methods**: Proposing three generation methods (M1, M2, M3) to control the difficulty level of the generated problems. M3 is the strictest, keeping the adversarial examples closest to the original problems in difficulty and coherence.
4. **Experimental Results**: Conducting experiments on various open-source and closed-source LLMs, which show that even problems generated by the strictest method can significantly reduce the LLMs' problem-solving accuracy.
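The pipeline described above (MWP to Python code, AST edit of the numeric values, constraint check) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy problem, the `NumberPerturber` class, the `max_int` range, and the `same_kind` check are all assumptions made for illustration, and the constraints shown are only a simplified subset (integers stay positive integers, proper fractions stay in (0, 1), and the answer keeps the original's type and sign).

```python
import ast
import random

# Hypothetical Python rendering of a toy word problem (not from the paper):
# "Ann has 12 apples and buys 3 bags of 5 apples each. How many does she have?"
PROBLEM_CODE = """
apples = 12
bags = 3
per_bag = 5
answer = apples + bags * per_bag
"""


class NumberPerturber(ast.NodeTransformer):
    """Replace each numeric literal with a new value of the same kind."""

    def __init__(self, rng, max_int=1000):
        self.rng = rng
        self.max_int = max_int  # assumed upper bound for sampled values

    def visit_Constant(self, node):
        value = node.value
        # Leave non-numeric constants alone; bool is a subclass of int.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return node
        if isinstance(value, int):
            new_value = self.rng.randint(1, self.max_int)  # positive integer
        elif 0 < value < 1:
            new_value = round(self.rng.uniform(0.01, 0.99), 2)  # proper fraction
        else:
            new_value = round(self.rng.uniform(1, self.max_int), 2)
        # A literal like -5 parses as UnaryOp(USub, Constant(5)), so the sign
        # lives in the tree structure and is preserved automatically.
        return ast.copy_location(ast.Constant(value=new_value), node)


def evaluate(tree):
    """Run the problem code and return the value bound to `answer`."""
    namespace = {}
    exec(compile(tree, "<mwp>", "exec"), namespace)
    return namespace["answer"]


def same_kind(a, b):
    """Simplified answer constraint: same numeric type and same sign."""
    return type(a) is type(b) and (a > 0) == (b > 0)


def generate_variant(code, seed, max_tries=100):
    """Resample numeric values until the answer satisfies the constraint."""
    rng = random.Random(seed)
    original_answer = evaluate(ast.parse(code))
    for _ in range(max_tries):
        tree = NumberPerturber(rng).visit(ast.parse(code))
        ast.fix_missing_locations(tree)
        answer = evaluate(tree)
        if same_kind(answer, original_answer):
            return ast.unparse(tree), answer  # ast.unparse needs Python 3.9+
    raise ValueError("no valid variant found")


variant_code, variant_answer = generate_variant(PROBLEM_CODE, seed=0)
print(variant_code)
print("ground-truth answer:", variant_answer)
```

Because the new ground-truth answer is computed by executing the edited code, each variant comes with a known correct answer that an LLM's response can be checked against. In a full pipeline the edited values would also have to be written back into the natural-language problem text, which this sketch omits.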
Additionally, the paper compares the effects of different generation methods and analyzes the generality and transferability of the adversarial examples. In summary, the goal of this paper is to test and reveal the limitations of large language models in solving specific types of adversarial math word problems, thereby providing a new tool for educational assessment.