Efficient and Scalable Estimation of Tool Representations in Vector Space

Suhong Moon, Siddharth Jha, Lutfi Eren Erdogan, Sehoon Kim, Woosang Lim, Kurt Keutzer, Amir Gholami
2024-09-03
Abstract: Recent advancements in function calling and tool use have significantly enhanced the capabilities of large language models (LLMs) by enabling them to interact with external information sources and execute complex tasks. However, the limited context window of LLMs presents challenges when a large number of tools are available, necessitating efficient methods to manage prompt length and maintain accuracy. Existing approaches, such as fine-tuning LLMs or leveraging their reasoning capabilities, either require frequent retraining or incur significant latency overhead. A more efficient solution involves training smaller models to retrieve the most relevant tools for a given query, although this requires high quality, domain-specific data. To address those challenges, we present a novel framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models. Empowered by LLMs, we create ToolBank, a new tool retrieval dataset that reflects real human user usages. For tool retrieval methodologies, we propose novel approaches: (1) Tool2Vec: usage-driven tool embedding generation for tool retrieval, (2) ToolRefiner: a staged retrieval method that iteratively improves the quality of retrieved tools, and (3) MLC: framing tool retrieval as a multi-label classification problem. With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank. Additionally, we present further experimental results to rigorously validate our methods. Our code is available at https://github.com/SqueezeAILab/Tool2Vec
Machine Learning, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
This paper addresses how to retrieve relevant tools efficiently and accurately within a limited context window when large language models (LLMs) are used for tool invocation. When the number of available tools is large, the challenge is to keep the prompt short while maintaining accuracy. Existing methods, such as fine-tuning LLMs or using their reasoning ability to select tools, either require frequent retraining or add significant latency overhead. Description-based tool retrieval also performs poorly, because there is a clear semantic gap between tool descriptions and user queries. The paper therefore proposes a usage-driven tool-embedding generation method (Tool2Vec) and a two-stage tool-retrieval technique, aiming to improve both the efficiency and the accuracy of tool retrieval.

### Main Contributions

1. **ToolBank Dataset**: A new high-quality, domain-specific tool-retrieval dataset, ToolBank, is constructed, and three new datasets are instantiated within this framework. In a quality evaluation judged by GPT-4-turbo, these datasets achieved a 60% higher win rate than ToolBench queries.
2. **Tool2Vec**: A usage-based tool-embedding generation method is proposed in place of the traditional approach that relies on tool descriptions. In addition, a two-stage tool-retrieval method is introduced, which gradually improves the quality of the retrieved tools through a "retrieve-then-refine" scheme.
3. **Performance Improvement**: On the most difficult ToolBench split, the recall of this method is more than 25% higher than that of the ToolBench retriever. On domain-specific datasets, its recall is more than 30% higher than that of description-based retrieval.

### Method Overview

1. **Tool2Vec**: Tool embeddings are generated from user queries associated with each tool rather than from tool descriptions. This reduces the distribution gap between queries and tool embeddings and improves retrieval accuracy.
2. **Multi-Label Classification (MLC)**: Tool retrieval is framed as a multi-label classification problem, and a model is trained to predict whether each tool is relevant to a given query.
3. **ToolRefiner**: A second-stage refinement model that further refines the candidate tools retrieved in the first stage. ToolRefiner improves retrieval accuracy by considering tool-query and tool-tool interactions.

### Experimental Results

The paper evaluates the proposed methods on multiple benchmark datasets, including ToolBench and ToolBank. The results show that the proposed methods significantly outperform existing baselines on all metrics, especially Recall@K.

### Conclusion

By introducing Tool2Vec and the two-stage tool-retrieval method, the paper addresses the problem of efficiently and accurately retrieving relevant tools from a large tool set, offering a new approach to tool invocation in practical LLM applications.
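
### Illustrative Sketch

The usage-driven idea behind Tool2Vec and the Recall@K metric can be illustrated with a small, self-contained sketch. It is not the authors' implementation. It assumes that a tool's embedding is the mean of the embeddings of the queries that use it, and it substitutes a toy bag-of-words vector for the small encoder model. The tool names (`get_weather`, `convert_currency`), the example queries, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def embed(text, vocab):
    """Toy bag-of-words embedding, L2-normalized (stand-in for a real encoder)."""
    tokens = text.lower().split()
    vec = [float(tokens.count(word)) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def build_tool_embeddings(examples, vocab):
    """Assumed usage-driven scheme: average the embeddings of the queries using each tool."""
    sums = {}
    counts = defaultdict(int)
    for query, tools in examples:
        q_vec = embed(query, vocab)
        for tool in tools:
            acc = sums.setdefault(tool, [0.0] * len(vocab))
            for i, v in enumerate(q_vec):
                acc[i] += v
            counts[tool] += 1
    return {tool: [v / counts[tool] for v in acc] for tool, acc in sums.items()}


def retrieve(query, tool_embeddings, vocab, k):
    """Return the k tools whose embeddings are most similar to the query embedding."""
    q_vec = embed(query, vocab)
    ranked = sorted(
        tool_embeddings,
        key=lambda tool: cosine(q_vec, tool_embeddings[tool]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(retrieved, relevant):
    """Fraction of the relevant tools that appear among the retrieved tools."""
    relevant = set(relevant)
    return len(relevant & set(retrieved)) / len(relevant)


# Hypothetical (query, tools used) pairs standing in for a usage dataset.
EXAMPLES = [
    ("what is the weather in paris", ["get_weather"]),
    ("will it rain tomorrow", ["get_weather"]),
    ("convert 10 dollars to euros", ["convert_currency"]),
    ("weather in tokyo and price in yen", ["get_weather", "convert_currency"]),
]
VOCAB = sorted({tok for query, _ in EXAMPLES for tok in query.lower().split()})
TOOL_EMBEDDINGS = build_tool_embeddings(EXAMPLES, VOCAB)

top = retrieve("is it going to rain in london", TOOL_EMBEDDINGS, VOCAB, k=1)
print(top, recall_at_k(top, ["get_weather"]))
```

Because each tool vector is built from queries rather than from a description, a new query is compared against text drawn from the same distribution. This is the gap-closing effect the summary attributes to Tool2Vec. A practical system would replace `embed` with a trained encoder and could pass the top-k candidates to a second-stage refiner.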