Stepwise Self-Consistent Mathematical Reasoning with Large Language Models

Zilong Zhao, Yao Rong, Dongyang Guo, Emek Gözlüklü, Emir Gülboy, Enkelejda Kasneci
2024-02-24
Abstract: Using Large Language Models for complex mathematical reasoning is difficult, primarily due to the complexity of multi-step reasoning. The main challenges of this process include (1) selecting critical intermediate results to advance the procedure, and (2) limited exploration of potential solutions. To address these issues, we introduce a novel algorithm, namely Stepwise Self-Consistent Chain-of-Thought (SSC-CoT). SSC-CoT employs a strategy of selecting intermediate steps based on the intersection of various reasoning chains. Additionally, SSC-CoT enables the model to discover critical intermediate steps by querying a knowledge graph comprising relevant domain knowledge. To validate SSC-CoT, we present a new dataset, TriMaster100, tailored for complex trigonometry problems. This dataset contains 100 questions, with each solution broken down into scored intermediate steps, facilitating a comprehensive evaluation of the mathematical reasoning process. On TriMaster100, SSC-CoT triples the effectiveness of the state-of-the-art methods. Furthermore, we benchmark SSC-CoT on the widely recognized complex mathematical question dataset, MATH level 5, and it surpasses the second-best method by 7.2% in accuracy. Code and the TriMaster100 dataset can be found at:
Artificial Intelligence, Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper addresses the difficulty that large language models (LLMs) have with complex mathematical problems, particularly with multi-step reasoning. The authors propose a new algorithm, Stepwise Self-Consistent Chain-of-Thought (SSC-CoT), to improve how LLMs handle such problems. Specifically, the paper targets two issues:

1. **Identifying key intermediate results**: For complex mathematical problems, existing methods often struggle to determine which intermediate results are crucial for advancing the solution.
2. **Limited exploration of potential solutions**: Existing methods may not fully explore the possible solution paths.

To address these challenges, SSC-CoT selects intermediate steps from the intersection of multiple reasoning chains and discovers key intermediate steps by querying a knowledge graph containing relevant domain knowledge.

Additionally, the paper introduces a new dataset, TriMaster100, designed to evaluate the ability to solve complex trigonometry problems. The dataset contains 100 problems, with each solution broken down into scored intermediate steps, which allows the reasoning process to be assessed and not only the final answer.

On TriMaster100, SSC-CoT triples the effectiveness of state-of-the-art methods. On MATH level 5, another widely recognized dataset of complex mathematical problems, it surpasses the second-best method by 7.2% in accuracy. The gains are attributed in particular to identifying and using key intermediate steps.

In summary, the main contributions of this paper include:

- Proposing a new multi-step reasoning algorithm, SSC-CoT, for solving complex mathematical problems.
- Designing a knowledge graph query mechanism to help LLMs retrieve information more effectively.
- Constructing a new dataset, TriMaster100, for evaluating intermediate results in solving highly complex mathematical problems.
- Benchmarking SSC-CoT on the TriMaster100 and MATH datasets, demonstrating its superior performance in solving complex problems.
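
The idea of selecting intermediate steps from the intersection of multiple reasoning chains can be illustrated with a small sketch. This is not the authors' implementation: the function names `normalize` and `shared_steps`, the `min_chains` threshold, and the whitespace-and-case normalization are all assumptions made for illustration. The sketch also skips the step of sampling chains from an LLM and the knowledge graph query.

```python
from collections import Counter


def normalize(step: str) -> str:
    """Crude canonical form (an assumption): lowercase, all whitespace removed."""
    return "".join(step.lower().split())


def shared_steps(chains: list[list[str]], min_chains: int = 2) -> list[str]:
    """Return intermediate results that occur in at least `min_chains` chains."""
    counts = Counter()
    for chain in chains:
        # A set ensures a result repeated inside one chain is counted once.
        counts.update({normalize(step) for step in chain})
    return sorted(step for step, n in counts.items() if n >= min_chains)


# Hypothetical intermediate results from three independently sampled chains.
chains = [
    ["sin(x)^2 + cos(x)^2 = 1", "cos(x) = 3/5"],
    ["Sin(x)^2+cos(x)^2 = 1", "tan(x) = 4/3"],
    ["cos(x) = 3/5", "tan(x) = 4/3", "sin(x) = 4/5"],
]

print(shared_steps(chains))
```

Here `sin(x) = 4/5` is dropped because only one chain produced it, while the three results that two chains agree on are kept as candidates for the next reasoning round. String matching is a deliberate simplification; a real system would likely need a more robust notion of equivalence between mathematical expressions.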