Markov Chain of Thought for Efficient Mathematical Reasoning

Wen Yang, Kai Fan, Minpeng Liao
2024-10-23
Abstract: Multi-step Chain of Thought (CoT) benefits from the logical structure of the reasoning steps and from task-specific actions, significantly enhancing the mathematical reasoning capabilities of large language models. With the prevalence of long CoT, the number of reasoning steps can exceed manageable token limits and lead to higher computational demands. Inspired by the fundamental logic of human cognition, "derive, then reduce", we conceptualize the standard multi-step CoT as a novel Markov Chain of Thought (MCoT). In this study, we consider the mathematical reasoning task, defining each reasoning step as text accompanied by a Python code snippet. To facilitate a longer reasoning path, self-correction is enabled through interactions with the code interpreter. Our MCoT aims to compress previous reasoning steps into a simplified question, enabling efficient next-step inference without relying on a lengthy KV cache. In our experiments, we curate the MCoTInstruct dataset, and the empirical results indicate that MCoT not only significantly enhances efficiency but also maintains comparable accuracy. While much remains to be explored, this work paves the way for exploring the long CoT reasoning abilities of LLMs.
Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the efficiency and accuracy problems that large language models (LLMs) face when performing complex mathematical reasoning. Although existing multi-step reasoning (MSR) methods improve reasoning ability, a growing number of reasoning steps brings excessive consumption of computing resources, longer reasoning time, and accumulated errors. These problems make multi-step reasoning inefficient in practical applications.

To address them, the authors propose a new reasoning framework, Markov Chain of Thought (MCoT). MCoT decomposes a complex reasoning process into a series of simplified sub-problems and uses the Markov property to model the transitions between these sub-problems, thereby achieving efficient reasoning. Its core idea is inspired by the "derive, then reduce" pattern of human cognition: each reasoning step depends only on the current state, not on the earlier history. This reduces memory and compute requirements and also improves the speed and accuracy of reasoning.

### Main contributions

1. **Propose an innovative framework**: Use the Markov property to treat the reasoning process as a sequence of transitions between states.
2. **Construct the MCoTInstruct dataset**: A dataset designed specifically for mathematical reasoning tasks, intended to support the research community.
3. **Experimental verification**: Extensive experiments show that, with up to 8 reasoning steps, MCoT is 1.9 times faster than traditional multi-step reasoning while maintaining higher accuracy.
4. **Explore advanced reasoning abilities**: Provide a new way to explore more advanced reasoning abilities; the authors state that model checkpoints and code repositories will be released upon acceptance.
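
The memoryless loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `StepResult`, `toy_policy`, `run_mcot`, and `run_msr` are hypothetical names introduced here, `toy_policy` is a deterministic stand-in for the LLM plus code interpreter, and prompt length in characters is only a crude proxy for KV cache size.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    step_text: str                 # derivation produced at this step
    next_question: Optional[str]   # reduced question, or None when finished
    answer: Optional[str]          # final answer, set only on the last step


def toy_policy(question: str) -> StepResult:
    """Hypothetical stand-in for the LLM and code interpreter.

    Treats the question as a '+'-separated sum and folds the first two
    terms into one, so every call yields a strictly simpler question.
    """
    terms = [int(t) for t in question.split("+")]
    if len(terms) == 1:
        return StepResult(f"answer is {terms[0]}", None, str(terms[0]))
    merged = terms[0] + terms[1]
    reduced = "+".join(str(t) for t in [merged] + terms[2:])
    return StepResult(f"{terms[0]}+{terms[1]}={merged}", reduced, None)


Policy = Callable[[str], StepResult]


def run_mcot(question: str, policy: Policy,
             max_steps: int = 8) -> Tuple[Optional[str], List[int]]:
    """Markov-style loop: each step sees only the current question."""
    context_sizes = []
    for _ in range(max_steps):
        context_sizes.append(len(question))  # prompt = current state only
        result = policy(question)
        if result.answer is not None:
            return result.answer, context_sizes
        question = result.next_question      # earlier steps are discarded
    return None, context_sizes


def run_msr(question: str, policy: Policy,
            max_steps: int = 8) -> Tuple[Optional[str], List[int]]:
    """Multi-step baseline: the prompt accumulates every earlier step."""
    history = question
    context_sizes = []
    for _ in range(max_steps):
        context_sizes.append(len(history))   # prompt = question + all steps
        # The toy policy still reads the reduced question; only the
        # tracked prompt size models the growing history.
        result = policy(question)
        history += "\n" + result.step_text
        if result.answer is not None:
            return result.answer, context_sizes
        question = result.next_question
    return None, context_sizes


mcot_answer, mcot_sizes = run_mcot("1+2+3+4", toy_policy)
msr_answer, msr_sizes = run_msr("1+2+3+4", toy_policy)
print(mcot_answer, mcot_sizes)
print(msr_answer, msr_sizes)
```

Both loops reach the same answer on this toy input, but the tracked prompt shrinks in `run_mcot` and grows in `run_msr`, which is the intuition behind the reported savings in cache usage and decoding time.
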
### Specific methods

- **Markov Chain of Thought reasoning**: Assume that each successful derivation step reduces the original problem to a simpler one, so that a series of such reductions finally yields the answer. Defining a probability distribution over the newly generated problems and applying the Markov property makes the reasoning process memoryless.
- **MCoTInstruct dataset construction**: Extract seed data from existing multi-step reasoning datasets and expand the dataset through self-distillation to improve data coverage and diversity.

### Experimental results

- **Accuracy**: The MCoT models outperform other open-source math-solving models on multiple datasets. In particular, on the MATH dataset, MCoT-DeepSeek achieves an accuracy of 55.8%, exceeding all 34B and 70B models.
- **Efficiency**: Compared with MSR, MCoT significantly improves reasoning efficiency. The average GPU cache usage and decoding time of MCoT are significantly lower, especially when there are more reasoning steps.

In summary, by introducing the MCoT framework, the paper addresses the efficiency and accuracy bottlenecks that existing reasoning methods encounter on complex mathematical problems, providing a new direction for future research.
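
The Markov property referred to in the method description can be written out explicitly. The notation below is an assumption made for illustration and may differ from the paper's: $q_1$ is the original question, $s_t$ is the derivation step at step $t$, $q_{t+1}$ is the reduced question that step produces, and the last reduced question $q_{T+1}$ plays the role of the final answer.

```latex
% Multi-step reasoning: each step conditions on the full history.
p(s_{1:T} \mid q_1) = \prod_{t=1}^{T} p(s_t \mid q_1, s_{1:t-1})

% MCoT: a step and its reduced question depend only on the current question.
p(s_t, q_{t+1} \mid q_1, s_{1:t-1}, q_{2:t}) = p(s_t, q_{t+1} \mid q_t)
\quad\Longrightarrow\quad
p(s_{1:T}, q_{2:T+1} \mid q_1) = \prod_{t=1}^{T} p(s_t, q_{t+1} \mid q_t)
```

Under this reading, the conditioning set in each MCoT factor has a fixed size, so the context needed at step $t$ does not grow with $t$, whereas each multi-step factor conditions on all $t-1$ earlier steps.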