Efficient Tool Use with Chain-of-Abstraction Reasoning

Silin Gao, Jane Dwivedi-Yu, Ping Yu, Xiaoqing Ellen Tan, Ramakanth Pasunuru, Olga Golovneva, Koustuv Sinha, Asli Celikyilmaz, Antoine Bosselut, Tianlu Wang

2024-02-27

Abstract: To achieve faithful reasoning that aligns with human expectations, large language models (LLMs) need to ground their reasoning in real-world knowledge (e.g., web facts, math and physical rules). Tools help LLMs access this external knowledge, but there remain challenges in fine-tuning LLM agents (e.g., Toolformer) to invoke tools in multi-step reasoning problems, where interconnected tool calls require holistic and efficient tool usage planning.

Computation and Language

What problem does this paper attempt to address?

This paper addresses how large language models (LLMs) can use tools effectively in multi-step reasoning. Existing tool-augmented LLMs struggle on tasks that require several tool invocations: poor planning of tool usage leads to inaccurate reasoning, and waiting for API responses adds latency at inference time. To address this, the paper proposes an approach called "Chain-of-Abstraction" (CoA). CoA trains LLMs to first generate a reasoning chain with abstract placeholders and then call domain-specific tools to fill in concrete knowledge for that chain. Planning with abstract chains helps LLMs learn more generalizable reasoning strategies and makes them more robust to variations in the domain knowledge involved in different reasoning problems. It also lets LLMs perform decoding and external tool invocations in parallel, which avoids the delay of waiting for tool responses.

Experiments show that CoA outperforms previous chain-based reasoning and tool-augmented baselines on mathematical reasoning and Wikipedia-based question answering tasks, with an average increase of approximately 6% in answer accuracy and inference that is approximately 1.4 times faster on average. In human evaluation, CoA also guides LLMs toward more accurate reasoning, reducing reasoning errors by approximately 8%. In summary, the paper aims to improve LLM tool use in multi-step reasoning through chain-of-abstraction reasoning, improving both accuracy and efficiency.
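The two-stage idea described above (write an abstract chain first, then let a tool fill in the concrete values) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the bracketed `[expression = yN]` trace syntax, the names `calculator` and `fill_chain`, and the use of a small arithmetic evaluator as the stand-in domain tool. The sketch covers only the fill-in step; it does not show how the model is trained to produce the chain or how decoding and tool calls are run in parallel.

```python
import ast
import operator
import re

# Hypothetical trace format: each tool call is written as "[expression = yN]".
STEP = re.compile(r"\[([^\[\]=]+)=\s*(y\d+)\s*\]")

# Arithmetic operators the stand-in tool is allowed to apply.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Stand-in domain tool: evaluate +, -, *, / arithmetic without eval()."""

    def evaluate(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -evaluate(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return evaluate(ast.parse(expression.strip(), mode="eval").body)


def render(value) -> str:
    """Print whole numbers without a trailing '.0'."""
    return str(int(value)) if float(value).is_integer() else str(value)


def fill_chain(chain: str) -> str:
    """Replace every "[expression = yN]" step with the tool's result.

    Steps are resolved left to right, so a later step may refer to a
    placeholder (y1, y2, ...) defined by an earlier one.
    """
    values = {}

    def resolve(match):
        expression, name = match.group(1), match.group(2)
        # Substitute placeholders that earlier steps have already resolved.
        expression = re.sub(r"y\d+", lambda m: str(values[m.group(0)]), expression)
        values[name] = calculator(expression)
        return render(values[name])

    return STEP.sub(resolve, chain)


abstract_chain = "Together they won [20 + 35 = y1] games, so [90 - y1 = y2] games remain."
print(fill_chain(abstract_chain))
# prints: Together they won 55 games, so 35 games remain.
```

Because the chain itself contains no concrete results, the same abstract chain stays valid if the numbers in the question change; only the tool outputs differ. This is one way to read the robustness claim in the summary.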

