Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Shen Gao, Zhengliang Shi, Minghang Zhu, Bowen Fang, Xin Xin, Pengjie Ren, Zhumin Chen, Jun Ma, Zhaochun Ren
2023-12-21
Abstract: Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extending the capability of LLMs. Although some works employ open-source LLMs for the tool learning task, most of them are trained in a controlled environment in which LLMs only learn to execute the human-provided tools. However, selecting proper tools from the large toolset is also a crucial ability for the tool learning model to be applied in real-world applications. Existing methods usually directly employ self-instruction methods to train the model, which ignores differences in tool complexity. In this paper, we propose the Confucius, a novel tool learning framework to train LLM to use complicated tools in real-world scenarios, which contains two main phases: (1) We first propose a multi-stage learning method to teach the LLM to use various tools from an easy-to-difficult curriculum; (2) thenceforth, we propose the Iterative Self-instruct from Introspective Feedback (ISIF) to dynamically construct the dataset to improve the ability to use the complicated tool. Extensive experiments conducted on both controlled and real-world settings demonstrate the superiority of our tool learning framework in the real-world application scenarios compared to both tuning-free (e.g. ChatGPT, Claude) and tuning-based baselines (e.g. GPT4Tools).
Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses how to train large language models (LLMs) to use complex tools in real-world scenarios. It focuses on two main challenges:

1. **Tool Selection**: In practical applications, an LLM must select appropriate tools from a large toolset to complete a task. Existing methods usually give the model only the minimal set of tools needed for the task, with no redundant ones. In real-world settings, however, the model typically faces a large toolset containing many cross-task tools. The first challenge is therefore how to teach the model to select appropriate tools from a set of candidates.
2. **Use of Complex Tools**: Tools differ in how difficult they are to use, and some tools are used differently in different scenarios. For example, the Google Maps tool needs only the current coordinates when traveling, but planning a commuting route requires more information, such as the starting point, the destination, and preferences. To interact well with such tools, the model needs to be trained on them across a variety of scenarios. The second challenge is therefore how to identify which tools are more complex and how to improve the model's ability to use them.

To address these challenges, the paper proposes a tool-learning framework named "Confucius", which contains two main stages:

1. **Multi-stage Learning Method**: Teach the model to use various tools through an easy-to-difficult curriculum, divided into three training stages:
   - **Warm-up Training Stage**: Provide the model with only the minimal required set of tools, so that it learns to call and execute tools correctly.
   - **Same-class Training Stage**: Teach the model to select appropriate tools from among related candidate tools.
   - **Cross-class Training Stage**: Train the model in a real-world application setting, where the candidate toolset is produced by a tool retriever through semantic matching.
2. **Iterative Self-instruct from Introspective Feedback (ISIF)**: Improve the model's ability to use complex tools by constructing the dataset dynamically. ISIF includes two stages:
   - **Instance Generation**: Generate initial tool-use instances from a diverse set of tools.
   - **Update and Introspective Feedback**: Use the model's introspective feedback on tool use to generate more instances for complex tools and update the dataset.

Through these two stages, the Confucius framework improves the model's tool-use ability in real-world applications, especially when dealing with unseen tools. Experimental results show that Confucius outperforms existing tuning-free and tuning-based baselines on multiple evaluation metrics.
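
The three curriculum stages and the ISIF update step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the `Tool` record, the function names, the toy toolset, the word-overlap retriever (a stand-in for semantic matching), the choice to always keep the gold tools among the cross-class candidates, and the per-tool difficulty scores used as "introspective feedback" are all hypothetical. The summary does not specify how the feedback is actually measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical tool record; the fields are assumptions for illustration.
    name: str
    category: str
    description: str


def _tokens(text):
    return set(text.lower().split())


def retrieve(query, tools, k):
    """Toy retriever: rank tools by Jaccard word overlap with the query.

    Stand-in for the semantic-matching retriever mentioned in the text.
    """
    q = _tokens(query)

    def score(tool):
        d = _tokens(tool.description)
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(tools, key=score, reverse=True)[:k]


def candidate_tools(stage, query, gold, toolset, k=2):
    """Build the candidate toolset shown to the model at each curriculum stage."""
    if stage == "warm_up":
        # Only the tools the task actually needs.
        return list(gold)
    if stage == "same_class":
        # Gold tools plus related tools from the same category.
        categories = {t.category for t in gold}
        return [t for t in toolset if t.category in categories]
    if stage == "cross_class":
        # Retrieved candidates; keeping the gold tools is an assumption here.
        retrieved = retrieve(query, toolset, k)
        return list(gold) + [t for t in retrieved if t not in gold]
    raise ValueError(f"unknown stage: {stage}")


def hardest_tools(feedback, n):
    """Return the n tool names with the highest (assumed) difficulty score."""
    return sorted(feedback, key=feedback.get, reverse=True)[:n]


def isif_update(dataset, feedback, generate, n_tools=1, per_tool=2):
    """One ISIF-style round: add new instances for the hardest tools."""
    new_instances = [
        generate(name)
        for name in hardest_tools(feedback, n_tools)
        for _ in range(per_tool)
    ]
    return dataset + new_instances


# Toy data, invented for this example.
TOOLSET = [
    Tool("maps_route", "maps", "plan a commuting route between two places"),
    Tool("maps_locate", "maps", "show current location coordinates"),
    Tool("weather", "weather", "get the weather forecast for a city"),
    Tool("calculator", "math", "evaluate an arithmetic expression"),
]
QUERY = "plan a commuting route from home to work"
GOLD = [TOOLSET[0]]
```

Calling `candidate_tools` with `"warm_up"`, `"same_class"`, and `"cross_class"` in turn shows the candidate set growing from the gold tool alone, to same-category tools, to tools retrieved from the whole toolset. `isif_update` takes any callable as `generate`; in a real pipeline that would be an LLM-based instance generator, which is outside the scope of this sketch.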