Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Jinyang Wu,Mingkuan Feng,Shuai Zhang,Feihu Che,Zengqi Wen,Jianhua Tao

2024-11-28

Abstract: In-context Learning (ICL) enables large language models (LLMs) to tackle downstream tasks through sophisticated prompting and high-quality demonstrations. However, this traditional ICL paradigm shows limitations when facing complex mathematical reasoning tasks, primarily due to its heavy dependence on example quality and the necessity for human intervention in challenging scenarios. To address these limitations, this paper presents HiAR-ICL, a **Hi**gh-level **A**utomated **R**easoning paradigm in **ICL** that shifts focus from specific examples to abstract thinking patterns, extending the conventional concept of context in ICL. HiAR-ICL introduces five atomic reasoning actions as fundamental components for constructing chain-structured patterns. Using Monte Carlo Tree Search, we explore reasoning paths and construct thought cards to guide subsequent inference. We then develop a cognitive complexity framework that dynamically matches problems with appropriate thought cards. Experimental results demonstrate HiAR-ICL's effectiveness, achieving state-of-the-art accuracy (79.6%) on the MATH benchmark with Qwen2.5-7B-Instruct, surpassing GPT-4o (76.6%) and Claude 3.5 (71.1%).

Computation and Language

What problem does this paper attempt to address?

This paper addresses limitations of existing In-Context Learning (ICL) methods on complex mathematical reasoning tasks. Specifically:

1. **Dependence on Example Quality**: The effectiveness of traditional ICL depends heavily on the quality of the provided examples. If the examples are low-quality or unrepresentative, the model's reasoning ability suffers severely.
2. **Requirement for Manual Intervention**: Complex reasoning tasks often require carefully hand-designed, high-quality examples, which is both time-consuming and labor-intensive.
3. **Limited Generalization Ability**: For reasoning tasks that share a logical structure but differ in surface form, traditional ICL needs new examples to be constructed for each variant, which limits its generalization ability.

To overcome these limitations, the paper proposes HiAR-ICL (High-level Automated Reasoning paradigm in ICL), which extends the traditional notion of context in ICL from specific examples to abstract thinking patterns, using Monte Carlo Tree Search (MCTS) to explore reasoning paths. The method aims to "teach the model how to think, rather than just what to think", thereby improving the model's performance on complex reasoning tasks.
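To make the idea of searching over abstract reasoning patterns more concrete, the following is a minimal sketch of generic MCTS with UCT selection over chains of actions. It is not the authors' implementation. The action labels (`a1` to `a5`), the maximum chain length, the exploration constant, and the `toy_reward` function are all assumptions made for illustration; in the paper's setting, the reward would presumably come from whether the LLM solves a problem when following the chain.

```python
import math
import random

# Hypothetical labels: the abstract mentions five atomic reasoning
# actions but does not name them, so placeholders are used here.
ACTIONS = ["a1", "a2", "a3", "a4", "a5"]
MAX_DEPTH = 3  # assumed maximum length of a chain-structured pattern


class Node:
    def __init__(self, path=(), parent=None):
        self.path = path          # tuple of actions chosen so far
        self.parent = parent
        self.children = {}        # action -> Node
        self.visits = 0
        self.total_reward = 0.0

    def is_terminal(self):
        return len(self.path) >= MAX_DEPTH

    def fully_expanded(self):
        return len(self.children) == len(ACTIONS)


def uct_score(child, parent_visits, c=1.4):
    """Standard UCT: mean reward plus an exploration bonus."""
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def toy_reward(path):
    """Toy stand-in for task success; rewards overlap with a fixed chain."""
    target = ("a1", "a3", "a5")  # arbitrary chain treated as the best one
    return sum(p == t for p, t in zip(path, target)) / len(target)


def search(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node()
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes using UCT.
        while not node.is_terminal() and node.fully_expanded():
            node = max(node.children.values(),
                       key=lambda ch: uct_score(ch, node.visits))
        # 2. Expansion: add one untried action below the selected node.
        if not node.is_terminal():
            untried = [a for a in ACTIONS if a not in node.children]
            action = rng.choice(untried)
            child = Node(node.path + (action,), parent=node)
            node.children[action] = child
            node = child
        # 3. Simulation: complete the chain with random actions.
        rollout = list(node.path)
        while len(rollout) < MAX_DEPTH:
            rollout.append(rng.choice(ACTIONS))
        reward = toy_reward(tuple(rollout))
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return root


def best_path(root):
    """Follow the most-visited child at each level."""
    path, node = [], root
    while node.children:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        path.append(node.path[-1])
    return tuple(path)
```

Calling `best_path(search())` returns the most-visited action chain. In a pipeline like the one the abstract describes, such a chain could be stored as a reusable pattern (a "thought card") instead of a worked example, though how the paper builds and selects cards is not specified in this summary.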

GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning

Teaching Algorithmic Reasoning via In-context Learning

When Do Program-of-Thoughts Work for Reasoning?

Multi-tool Integration Application for Math Reasoning Using Large Language Model

CLR-Fact: Evaluating the Complex Logical Reasoning Capability of Large Language Models over Factual Knowledge

AI Reasoning Systems: PAC and Applied Methods

Fast Counterfactual Inference for History-Based Reinforcement Learning.

Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks

Counterfactual Collaborative Reasoning

Eliminating Reasoning via Inferring with Planning: A New Framework to Guide LLMs' Non-linear Thinking

Learning Robust Rule Representations for Abstract Reasoning via Internal Inferences

From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning

Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search

Interpretable Contrastive Monte Carlo Tree Search Reasoning

Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data

ART: Automatic multi-step reasoning and tool-use for large language models

ARES: Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback

CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity

Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning