Chain of Tools: Large Language Model is an Automatic Multi-tool Learner

Zhengliang Shi, Shen Gao, Xiuyi Chen, Yue Feng, Lingyong Yan, Haibo Shi, Dawei Yin, Zhumin Chen, Suzan Verberne, Zhaochun Ren
DOI: https://doi.org/10.48550/arXiv.2405.16533
2024-05-26
Abstract: Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extend their utility, empowering them to solve practical tasks. Existing work typically empowers LLMs as tool users with a manually designed workflow, where the LLM plans a series of tools in a step-by-step manner, and sequentially executes each tool to obtain intermediate results until deriving the final answer. However, these methods suffer from two challenges in realistic scenarios: (1) The handcrafted control flow is often ad-hoc and constrains the LLM to local planning; (2) The LLM is instructed to use only manually demonstrated tools or well-trained Python functions, which limits its generalization to new tools. In this work, we first propose Automatic Tool Chain (ATC), a framework that enables the LLM to act as a multi-tool user, which directly utilizes a chain of tools through programming. To scale up the scope of the tools, we next propose a black-box probing method. This further empowers the LLM as a tool learner that can actively discover and document tool usages, teaching itself to properly master new tools. For a comprehensive evaluation, we build a challenging benchmark named ToolFlow, which diverges from previous benchmarks by its long-term planning scenarios and complex toolset. Experiments on both existing datasets and ToolFlow illustrate the superiority of our framework. Analysis on different settings also validates the effectiveness and the utility of our black-box probing algorithm.
Computation and Language
What problem does this paper attempt to address?
The paper aims to address two main challenges faced by Large Language Models (LLMs) when utilizing external tools:

1. **Manually designed workflows restrict LLMs to local planning**: Existing methods often rely on handcrafted, step-by-step control flows, which confine the LLM to local planning and result in poor performance on long-term tasks.
2. **Limitations of the toolset**: Current methods are limited to using manually demonstrated tools or built-in Python functions, which limits generalization to new tools.

To address these issues, the authors set two research objectives:

- **Enable LLMs to automatically generate tool chains**: The LLM directly uses a series of tools through programming.
- **Make LLMs proactive multi-tool learners**: The LLM learns and masters new tools on its own.

To achieve these goals, the paper proposes a framework called Automatic Tool Chain (ATC) and introduces a black-box probing method to expand the toolset. Additionally, a new benchmark dataset named ToolFlow is created to evaluate tool learning methods in complex scenarios. Experimental results show that this framework outperforms existing baseline methods across multiple datasets.
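
As a rough illustration of the two ideas summarized above, the sketch below shows (a) a chain of tools expressed as a single program rather than as separate step-by-step calls, and (b) a very simplified form of black-box probing, in which a tool is called with a sample input and the shape of its response is recorded as documentation. This is not the authors' implementation. The tools `search_movie` and `get_person`, the `GENERATED_PROGRAM` string (standing in for LLM output), and the helpers `run_tool_chain` and `probe_tool` are all hypothetical names introduced for this example.

```python
from typing import Any, Callable, Dict, Sequence

# Hypothetical tools standing in for real APIs (assumed names and return shapes).
def search_movie(title: str) -> Dict[str, Any]:
    catalog = {"Inception": {"id": 1, "director_id": 10}}
    return catalog[title]

def get_person(person_id: int) -> Dict[str, Any]:
    people = {10: {"name": "Christopher Nolan", "birth_year": 1970}}
    return people[person_id]

TOOLS: Dict[str, Callable[..., Any]] = {
    "search_movie": search_movie,
    "get_person": get_person,
}

# Stand-in for a program an LLM might generate in one pass: the output of the
# first tool feeds the second, so the whole chain is planned up front.
GENERATED_PROGRAM = '''
movie = search_movie("Inception")
director = get_person(movie["director_id"])
answer = director["name"]
'''

def run_tool_chain(program: str, tools: Dict[str, Callable[..., Any]]) -> Any:
    """Execute a generated program with the tools in scope and return `answer`.

    exec() is used only for illustration; untrusted generated code would need
    a sandbox in practice.
    """
    namespace: Dict[str, Any] = dict(tools)
    exec(program, namespace)
    return namespace["answer"]

def probe_tool(tool: Callable[..., Dict[str, Any]], sample_args: Sequence[Any]) -> Dict[str, Any]:
    """Simplified probing: call the tool once and record its response fields."""
    response = tool(*sample_args)
    return {
        "name": tool.__name__,
        "sample_args": list(sample_args),
        "response_fields": {key: type(value).__name__ for key, value in response.items()},
    }

if __name__ == "__main__":
    print(run_tool_chain(GENERATED_PROGRAM, TOOLS))
    print(probe_tool(search_movie, ["Inception"]))
```

In this sketch, the record returned by `probe_tool` plays the role of the tool documentation that a tool learner would write for itself and later consult when generating a program; the probing method described in the paper may differ substantially from this single-call version.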