Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models

Anchun Gui,Jian Li,Yong Dai,Nan Du,Han Xiao

2024-08-28

Abstract: Tool-augmented large language models (LLMs) are attracting widespread attention for accessing up-to-date knowledge and alleviating hallucination issues. Advanced closed-source LLMs (e.g., ChatGPT) have demonstrated surprising tool-usage capabilities through prompting and in-context learning techniques. To empower open-source LLMs (e.g., LLaMA) to manipulate tools, current efforts focus on either template-driven or token-triggered tool-usage. However, the former hampers LLMs' flexibility in addressing diverse user queries due to constrained tool interactions, while the latter limits generalizability when engaging with new tools, since tool-usage learning is based on task- and tool-specific datasets. To alleviate these concerns, in this paper, we propose a decision-aware and generalizable tool-usage framework (DEER). Specifically, we first construct tool-usage samples with multiple decision branches via an automatic generation pipeline, thereby inspiring the decision-making awareness of LLMs under diverse scenarios. Meanwhile, we propose a novel tool sampling strategy to enhance the generalizability of LLMs over unseen tools. Extensive experiments demonstrate that our proposed DEER is effective and significantly outperforms baselines across various datasets.

Computation and Language

What problem does this paper attempt to address?

The paper targets two problems:

1. **Lack of tool-use decision-making awareness**: Existing methods, whether template-driven or token-triggered, lack flexible decision-making when handling different queries. For an ordinary query (such as "Give me 5 suggestions for staying happy"), an LLM should answer directly from its own knowledge instead of relying on external tools every time. When no suitable API is available, the LLM should also stop calling tools, avoiding ineffective tool use and slow inference.
2. **Insufficient generalization to unseen tools**: Token-triggered methods such as Toolformer are trained on task- and tool-specific datasets and therefore cannot be applied directly to new tools. This limits their adaptability when new tools appear.

To solve these problems, the paper proposes DEER, a decision-aware and generalizable tool-use framework built on open-source LLMs. DEER constructs tool-usage samples with multiple decision branches, which makes the model more flexible across different kinds of queries, and introduces a novel tool-sampling strategy to improve generalization to unseen tools. According to the paper, DEER significantly outperforms baseline models on multiple datasets.
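The two ideas above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the summary does not specify how DEER builds its samples, so the branch labels (`answer_directly`, `call_tool:<name>`, `no_suitable_tool`), the `Tool` class, and both helper functions are hypothetical names chosen for illustration. The sketch assumes each training query comes with a flag saying whether it needs a tool, and that candidate tools are drawn at random from a pool so that the correct tool is sometimes absent.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    """Hypothetical tool record; not a structure defined in the paper."""
    name: str
    description: str


def sample_candidate_tools(pool: List[Tool], gold: Optional[Tool],
                           k: int, rng: random.Random) -> List[Tool]:
    """Draw k candidates: the gold tool (if given) plus random distractors."""
    distractors = [t for t in pool if gold is None or t.name != gold.name]
    n_distractors = max(0, k - (1 if gold is not None else 0))
    chosen = rng.sample(distractors, min(n_distractors, len(distractors)))
    if gold is not None:
        chosen.append(gold)
    rng.shuffle(chosen)  # avoid a fixed position for the gold tool
    return chosen


def label_decision(needs_tool: bool, gold: Optional[Tool],
                   candidates: List[Tool]) -> str:
    """Assign one of three illustrative decision branches to a query."""
    if not needs_tool:
        return "answer_directly"          # the model's own knowledge suffices
    if gold is not None and any(t.name == gold.name for t in candidates):
        return f"call_tool:{gold.name}"   # a suitable tool is among the candidates
    return "no_suitable_tool"             # stop instead of calling a wrong tool


pool = [
    Tool("weather", "Current weather for a city"),
    Tool("calculator", "Evaluate arithmetic expressions"),
    Tool("search", "Web search"),
    Tool("translate", "Translate text between languages"),
    Tool("calendar", "Look up calendar events"),
]
rng = random.Random(0)
gold = pool[0]

with_gold = sample_candidate_tools(pool, gold, 3, rng)
without_gold = sample_candidate_tools(pool[1:], None, 3, rng)

print(label_decision(False, None, with_gold))
print(label_decision(True, gold, with_gold))
print(label_decision(True, gold, without_gold))
```

Varying the candidate set per sample, including cases where the needed tool is missing, is one plausible way to expose a model to all three branches; the actual sampling strategy in DEER may differ.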
