Abstract: Using Large Language Models for complex mathematical reasoning is difficult, primarily due to the complexity of multi-step reasoning. The main challenges of this process include (1) selecting critical intermediate results to advance the procedure, and (2) limited exploration of potential solutions. To address these issues, we introduce a novel algorithm, namely Stepwise Self-Consistent Chain-of-Thought (SSC-CoT). SSC-CoT employs a strategy of selecting intermediate steps based on the intersection of various reasoning chains. Additionally, SSC-CoT enables the model to discover critical intermediate steps by querying a knowledge graph comprising relevant domain knowledge. To validate SSC-CoT, we present a new dataset, TriMaster100, tailored for complex trigonometry problems. This dataset contains 100 questions, with each solution broken down into scored intermediate steps, facilitating a comprehensive evaluation of the mathematical reasoning process. On TriMaster100, SSC-CoT triples the effectiveness of the state-of-the-art methods. Furthermore, we benchmark SSC-CoT on the widely recognized complex mathematical question dataset, MATH level 5, and it surpasses the second-best method by 7.2% in accuracy. Code and the TriMaster100 dataset can be found at:

What problem does this paper attempt to address?

The paper addresses the difficulty large language models (LLMs) have with complex mathematical problems, particularly with multi-step reasoning. The authors propose a new algorithm, Stepwise Self-Consistent Chain-of-Thought (SSC-CoT), to improve how LLMs handle such problems. Specifically, the paper targets two issues:

1. **Identifying key intermediate results**: For complex mathematical problems, existing methods often struggle to determine which intermediate results are crucial for advancing the solution.
2. **Limited exploration of potential solutions**: Existing methods may not explore enough of the possible solution paths.

To address these challenges, SSC-CoT selects intermediate steps from the intersection of multiple reasoning chains, and it discovers key intermediate steps by querying a knowledge graph containing relevant domain knowledge.

The paper also introduces a new dataset, TriMaster100, designed to evaluate the ability to solve complex trigonometry problems. It contains 100 problems, each with a solution broken down into scored intermediate steps, so the reasoning process can be assessed and not only the final answer.

According to the abstract, SSC-CoT triples the effectiveness of state-of-the-art methods on TriMaster100 and surpasses the second-best method by 7.2% in accuracy on MATH level 5, another widely recognized dataset of complex mathematical problems.

In summary, the main contributions of this paper include:

- Proposing a new multi-step reasoning algorithm, SSC-CoT, for solving complex mathematical problems.
- Designing a knowledge graph query mechanism to help LLMs retrieve information more effectively.
- Constructing a new dataset, TriMaster100, for evaluating intermediate results in solving highly complex mathematical problems.
- Benchmarking SSC-CoT on the TriMaster100 and MATH datasets, demonstrating its superior performance in solving complex problems.
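The idea of selecting intermediate steps from the intersection of several reasoning chains can be illustrated with a minimal Python sketch. This is not the authors' implementation: the helper names `normalize` and `overlapping_steps`, the `min_chains` threshold, and the use of exact string matching after whitespace and case normalization are all assumptions made for illustration. The page does not say how SSC-CoT decides that two steps are the same.

```python
from collections import Counter


def normalize(step: str) -> str:
    """Crude canonical form (an assumption): lowercase and drop all whitespace."""
    return "".join(step.lower().split())


def overlapping_steps(chains, min_chains=2):
    """Return intermediate results that appear in at least `min_chains` chains.

    `chains` is a list of reasoning chains, each a list of step strings.
    Results are returned in order of first appearance, using the wording
    of the first chain that produced them.
    """
    counts = Counter()
    first_seen = {}
    for chain in chains:
        # Count each distinct step at most once per chain.
        for key in {normalize(s) for s in chain}:
            counts[key] += 1
        # Remember the original wording of the first occurrence.
        for s in chain:
            first_seen.setdefault(normalize(s), s)
    return [text for key, text in first_seen.items() if counts[key] >= min_chains]


# Hypothetical chains sampled for the same trigonometry question.
example_chains = [
    ["sin(x)^2 + cos(x)^2 = 1", "tan(x) = 3/4", "sin(x) = 3/5"],
    ["tan(x) = 3/4", "cos(x) = 4/5"],
    ["Sin(x)^2+cos(x)^2 = 1", "cos(x) = 4/5", "x = 0.64"],
]
shared = overlapping_steps(example_chains)
```

Here `shared` keeps only the three results that at least two chains agree on; steps produced by a single chain are dropped. A practical system would likely need a more tolerant equivalence check than string matching, for example symbolic comparison of expressions.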

Stepwise Self-Consistent Mathematical Reasoning with Large Language Models

Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning

Step Guided Reasoning: Improving Mathematical Reasoning using Guidance Generation and Step Reasoning

Learning Multi-Step Reasoning by Solving Arithmetic Tasks

Logic Contrastive Reasoning with Lightweight Large Language Model for Math Word Problems

Large Language Models for Mathematical Reasoning: Progresses and Challenges

Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model

Multi-tool Integration Application for Math Reasoning Using Large Language Model

Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From A Psychological Perspective

MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning

Improve Mathematical Reasoning in Language Models by Automated Process Supervision

SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models

MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline

Benchmarking Large Language Models for Math Reasoning Tasks

Enhancing Mathematical Reasoning in LLMs by Stepwise Correction

Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions

Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models

Reasoning in Large Language Models Through Symbolic Math Word Problems