Small LLMs Are Weak Tool Learners: A Multi-LLM Agent

Weizhou Shen, Chenliang Li, Hongzhan Chen, Ming Yan, Xiaojun Quan, Hehong Chen, Ji Zhang, Fei Huang
2024-02-16
Abstract: Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs, empowering them to interact with external tools (e.g., APIs, functions) and complete various tasks in a self-directed fashion. The challenge of tool use demands that LLMs not only understand user queries and generate answers accurately but also excel in task planning, tool invocation, and result summarization. While traditional works focus on training a single LLM with all these capabilities, performance limitations become apparent, particularly with smaller models. To overcome these challenges, we propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer. Each component is implemented by a single LLM that focuses on a specific capability and collaborates with others to accomplish the task. This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability. To effectively train this framework, we introduce a two-stage training paradigm. First, we fine-tune a backbone LLM on the entire dataset without discriminating sub-tasks, providing the model with a comprehensive understanding of the task. Second, the fine-tuned LLM is used to instantiate the planner, caller, and summarizer respectively, which are continually fine-tuned on respective sub-tasks. Evaluation across various tool-use benchmarks illustrates that our proposed multi-LLM framework surpasses the traditional single-LLM approach, highlighting its efficacy and advantages in tool learning.
Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses the limited tool-use ability of large language models (LLMs), particularly smaller ones, and proposes a multi-LLM agent framework called α-UMi. Traditional single-LLM methods try to give one model every required capability at once: understanding user queries, task planning, tool invocation, and result summarization. This works poorly for small-scale models. The paper instead decomposes these capabilities into a planner, a caller, and a summarizer, each implemented by a separate LLM, with the three collaborating to complete the task. Training follows a global-to-local progressive fine-tuning strategy (GLPFT): the backbone LLM is first fine-tuned on the complete dataset without distinguishing sub-tasks, and the resulting model is then used to initialize each component, which is further fine-tuned on its own sub-task. Experimental results show that α-UMi outperforms the traditional single-LLM approach on several tool-use benchmarks, demonstrating its effectiveness in tool learning, especially for small-scale LLMs.
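The division of labor between planner, caller, and summarizer can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `MultiLLMAgent`, the `Step` record, the `"call_tool"` / `"summarize"` decision strings, and the `get_weather` tool are all hypothetical names chosen for illustration, and the three "models" are plain stub functions standing in for separately fine-tuned LLMs.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    role: str      # "planner", "caller", or "tool"
    content: str


@dataclass
class MultiLLMAgent:
    # Each component would be a separate fine-tuned LLM in the paper;
    # here they are ordinary callables (an assumption for illustration).
    planner: Callable[[str, List[Step]], str]
    caller: Callable[[str, List[Step]], Dict]
    summarizer: Callable[[str, List[Step]], str]
    tools: Dict[str, Callable[..., str]]
    max_steps: int = 5

    def run(self, query: str) -> Tuple[str, List[Step]]:
        history: List[Step] = []
        for _ in range(self.max_steps):
            # The planner decides what happens next.
            decision = self.planner(query, history)
            history.append(Step("planner", decision))
            if decision == "summarize":
                break
            # The caller turns the plan into a concrete tool request.
            call = self.caller(query, history)
            history.append(Step("caller", json.dumps(call)))
            # The tool itself is ordinary code, not an LLM.
            result = self.tools[call["name"]](**call["arguments"])
            history.append(Step("tool", result))
        # The summarizer writes the final answer from the trajectory.
        return self.summarizer(query, history), history


# --- Stub components (hypothetical stand-ins for fine-tuned LLMs) ---
def stub_planner(query: str, history: List[Step]) -> str:
    # Call a tool once, then hand over to the summarizer.
    seen_tool = any(s.role == "tool" for s in history)
    return "summarize" if seen_tool else "call_tool"


def stub_caller(query: str, history: List[Step]) -> Dict:
    return {"name": "get_weather", "arguments": {"city": "Paris"}}


def stub_summarizer(query: str, history: List[Step]) -> str:
    observations = [s.content for s in history if s.role == "tool"]
    return "Answer: " + "; ".join(observations)


def get_weather(city: str) -> str:
    return f"Sunny in {city}"


agent = MultiLLMAgent(
    planner=stub_planner,
    caller=stub_caller,
    summarizer=stub_summarizer,
    tools={"get_weather": get_weather},
)
answer, history = agent.run("What is the weather in Paris?")
print(answer)
```

Because each role sits behind its own callable, one component can be swapped or retrained without touching the others, which is the modularity the abstract describes. Under GLPFT, all three callables would start from the same globally fine-tuned backbone before being specialized on their own sub-tasks.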