MathDivide: Improved mathematical reasoning by large language models

Saksham Sahai Srivastava, Ashutosh Gandhi

2024-05-13

Abstract: Large language models have proven capable of handling complex linguistic and cognitive tasks. Their use has therefore been extended to tasks that require logical reasoning, such as mathematics. In this paper, we propose a prompting technique called MathDivide that breaks a mathematical problem down into simpler subproblems. Each subproblem is formulated as an algebraic expression, which is evaluated by Python code that the LLM generates for that expression. The values fed to the Python code are the numerical values given in the problem statement. The solutions to the subproblems are composed to obtain the final answer to the problem. The final answer is then compared with the correct answer: if they match, it is produced as output; otherwise, a refinement prompt is fed to the LLM. We experiment with this prompting technique on both closed-source and open-source LLMs using the GSM8K dataset. The results demonstrate that MathDivide significantly outperforms the leading prompting technique, MathPrompter.
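The decompose, evaluate, and compose flow described in the abstract can be illustrated with a minimal Python sketch. The word problem, the way it is split, and the names `Subproblem` and `solve` are illustrative assumptions, not taken from the paper. In MathDivide the algebraic expressions and the evaluating code would be generated by the LLM; here they are written by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subproblem:
    name: str        # variable that this subproblem defines
    expression: str  # algebraic expression, kept for readability
    # Stand-in for the Python code an LLM would generate for the expression.
    evaluate: Callable[[Dict[str, float]], float]


def solve(subproblems: List[Subproblem], values: Dict[str, float]) -> Dict[str, float]:
    """Evaluate subproblems in order; later ones may use earlier results."""
    env = dict(values)
    for sp in subproblems:
        env[sp.name] = sp.evaluate(env)
    return env


# Hypothetical problem: pens cost $3 and notebooks cost $5.
# Ana buys 4 pens and 2 notebooks. How much does she spend?
values = {"pen_price": 3, "pens": 4, "notebook_price": 5, "notebooks": 2}

subproblems = [
    Subproblem("pen_cost", "pen_price * pens",
               lambda v: v["pen_price"] * v["pens"]),
    Subproblem("notebook_cost", "notebook_price * notebooks",
               lambda v: v["notebook_price"] * v["notebooks"]),
    Subproblem("total", "pen_cost + notebook_cost",
               lambda v: v["pen_cost"] + v["notebook_cost"]),
]

result = solve(subproblems, values)
print(result["total"])  # 22
```

Keeping each intermediate value under its own name means a wrong final answer can be traced back to the specific subproblem that produced it.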

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses how to improve the performance of large language models (LLMs) on mathematical reasoning tasks. The authors propose a prompting technique called MathDivide, which breaks a complex mathematical problem down into simpler sub-problems. Each sub-problem is represented as an algebraic expression and evaluated numerically by generated Python code. The solutions to the sub-problems are then combined to obtain the final answer to the original problem. If the final answer does not match the correct answer, the system gives the LLM a refined prompt indicating which sub-problem solutions are incorrect, guiding the LLM to improve its solution step by step.

The main contributions of the paper are:

1. **Problem Decomposition**: Breaking complex problems down into simple sub-problems helps LLMs better understand and solve mathematical problems.
2. **Algebraic Expressions and Python Code**: Generating algebraic expressions and the corresponding Python code ensures computational accuracy.
3. **Refined Prompts**: Refined prompts based on human feedback help LLMs identify and correct errors, improving their accuracy on mathematical reasoning tasks.

With these methods, MathDivide significantly improves the performance of LLMs on mathematical reasoning tasks, surpassing the existing leading technique, MathPrompter.
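The compare-and-refine step can be sketched as a short loop. This is a minimal illustration under stated assumptions: `ask_llm` is a hypothetical callable standing in for any model call, the wording of the refinement prompt is invented for the example, and `make_stub_llm` only replays canned answers so the loop can run without a model. None of these names come from the paper.

```python
import math
from typing import Callable, Iterable, Tuple


def solve_with_refinement(
    ask_llm: Callable[[str], float],
    problem: str,
    correct_answer: float,
    max_rounds: int = 3,
) -> Tuple[float, int]:
    """Query the model, then re-prompt while its answer is wrong."""
    answer = ask_llm(problem)
    rounds = 0
    while not math.isclose(answer, correct_answer) and rounds < max_rounds:
        rounds += 1
        # Hypothetical refinement prompt; the paper's wording may differ.
        prompt = (
            f"{problem}\n"
            f"Your previous answer {answer} was incorrect. "
            "Re-check each subproblem and try again."
        )
        answer = ask_llm(prompt)
    return answer, rounds


def make_stub_llm(replies: Iterable[float]) -> Callable[[str], float]:
    """Return a fake model that replays canned answers in order."""
    it = iter(replies)

    def stub(prompt: str) -> float:
        return next(it)

    return stub


# The stub answers wrongly once, then correctly after one refinement.
stub = make_stub_llm([20.0, 22.0])
answer, rounds = solve_with_refinement(stub, "How much does Ana spend?", 22.0)
print(answer, rounds)  # 22.0 1
```

The `max_rounds` cap is a design choice of this sketch: it keeps the loop from running indefinitely if the model never reaches the correct answer.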

MathPrompter: Mathematical Reasoning using Large Language Models

Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models

Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems

INC-Math: Integrating Natural Language and Code for Enhanced Mathematical Reasoning in Large Language Models

Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model

Large Language Models for Mathematical Reasoning: Progresses and Challenges

Prompting with Divide-and-Conquer Program Makes Large Language Models Discerning to Hallucination and Deception

Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models

Reasoning in Large Language Models Through Symbolic Math Word Problems

DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning

Teaching-Inspired Integrated Prompting Framework: A Novel Approach for Enhancing Reasoning in Large Language Models

SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models

Multilingual Mathematical Reasoning: Advancing Open-Source LLMs in Hindi and English

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering

Can Language Models Rival Mathematics Students? Evaluating Mathematical Reasoning through Textual Manipulation and Human Experiments

Benchmarking Large Language Models for Math Reasoning Tasks

Evaluating Mathematical Reasoning Beyond Accuracy

From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting

Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks