Abstract: As large language models (LLMs) advance, their inability to autonomously execute tasks by directly interacting with external tools remains a critical limitation. Traditional methods rely on inputting tool descriptions as context, which is constrained by context length and requires separate, often inefficient, retrieval mechanisms. We introduce ToolGen, a paradigm shift that integrates tool knowledge directly into the LLM's parameters by representing each tool as a unique token. This enables the LLM to generate tool calls and arguments as part of its next-token prediction capabilities, seamlessly blending tool invocation with language generation. Our framework allows the LLM to access and utilize a vast number of tools with no additional retrieval step, significantly enhancing both performance and scalability. Experimental results with over 47,000 tools show that ToolGen not only achieves superior results in both tool retrieval and autonomous task completion but also sets the stage for a new era of AI agents that can adapt to tools across diverse domains. By fundamentally transforming tool retrieval into a generative process, ToolGen paves the way for more versatile, efficient, and autonomous AI systems. ToolGen enables end-to-end tool learning and opens opportunities for integration with other advanced techniques such as chain-of-thought and reinforcement learning, thereby expanding the practical capabilities of LLMs.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses key limitations of large language models (LLMs) in interacting directly with external tools to perform tasks autonomously. Traditional methods pass tool descriptions to the model as context, which is limited by context length and requires a separate, often inefficient retrieval mechanism. This leads to four main challenges:

1. **Context length limitation**: Tool descriptions must be supplied as context, but the context window is finite and cannot hold a large number of tools.
2. **Inefficient retrieval mechanism**: Traditional retrievers usually rely on small encoders, which cannot fully capture the semantics of complex tools and queries, resulting in poor retrieval quality.
3. **Separation of retrieval and execution**: Traditional methods treat retrieval and execution as separate stages, which introduces inefficiency and potential inconsistency between the stages of task completion.
4. **Lack of tool-related knowledge**: Existing LLMs are pre-trained mainly on natural-language data and lack inherent knowledge of tool functionality, so they perform poorly when they must rely on retrieved tool descriptions for decision-making.

To solve these problems, the authors propose ToolGen, a framework that integrates tool knowledge directly into the parameters of the LLM by representing each tool as a unique virtual token. ToolGen enables the LLM to generate tool calls and arguments as part of next-token prediction, combining tool invocation with language generation. The LLM can therefore access and use a large number of tools without an additional retrieval step, which improves both performance and scalability.

### Main contributions

1. **Novel framework**: ToolGen integrates tool retrieval and execution into the LLM's generation process using virtual tokens.
2. **Three-stage training process**: ToolGen is trained in three stages (tool memorization, retrieval training, and end-to-end agent training), making tool retrieval and API calls more efficient and scalable.
3. **Experimental verification**: Experimental results show that ToolGen achieves performance comparable to the current best tool retrieval methods in both tool retrieval and LLM-based agent tasks, at lower cost and with higher efficiency, and that it surpasses the traditional tool-learning paradigm on large-scale tool libraries.

Through these innovations, ToolGen paves the way for more flexible, efficient, and autonomous AI systems that can adapt to various tools in diverse fields.
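The core idea of representing each tool as a unique virtual token can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: the vocabulary, the tool names, the `<<name>>` token format, and the scores are all hypothetical, and restricting the choice to tool tokens is an assumption of this sketch rather than something stated in the summary above. It shows one token being appended to the vocabulary per tool, so that choosing a tool becomes an ordinary next-token decision.

```python
from typing import Dict, List

# Hypothetical base vocabulary and tool names, for illustration only.
BASE_VOCAB = ["<pad>", "find", "weather", "in", "paris", "translate", "hello"]
TOOL_NAMES = ["get_weather", "translate_text", "search_flights"]


def tool_token(name: str) -> str:
    """Format a tool name as a single virtual token (format is an assumption)."""
    return f"<<{name}>>"


def extend_vocab(base: List[str], tools: List[str]) -> Dict[str, int]:
    """Append one new token per tool and return a token -> id mapping."""
    vocab = {tok: i for i, tok in enumerate(base)}
    for name in tools:
        vocab[tool_token(name)] = len(vocab)
    return vocab


def pick_tool(logits: List[float], vocab: Dict[str, int], tools: List[str]) -> str:
    """Greedy selection restricted to tool-token ids (assumed constraint)."""
    tool_ids = {vocab[tool_token(name)]: name for name in tools}
    best_id = max(tool_ids, key=lambda i: logits[i])
    return tool_ids[best_id]


vocab = extend_vocab(BASE_VOCAB, TOOL_NAMES)

# Made-up next-token scores; a trained model would produce these.
logits = [0.0] * len(vocab)
logits[vocab["weather"]] = 5.0  # ordinary word, ignored by the restriction
logits[vocab[tool_token("get_weather")]] = 3.0
logits[vocab[tool_token("translate_text")]] = 1.0

selected = pick_tool(logits, vocab, TOOL_NAMES)
print(selected)
```

In a real system the new token ids would need trained embeddings, which is what the three training stages described above are for. The sketch only shows why no separate retriever is needed once tools live in the vocabulary.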

ToolGen: Unified Tool Retrieval and Calling via Generation
