Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments

Yu Gu, Yiheng Shu, Hao Yu, Xiao Liu, Yuxiao Dong, Jie Tang, Jayanth Srinivasa, Hugo Latapie, Yu Su
2024-10-04
Abstract: The applications of large language models (LLMs) have expanded well beyond the confines of text processing, signaling a new era where LLMs are envisioned as generalist agents capable of operating within complex environments. These environments are often highly expansive, making it impossible for the LLM to process them within its short-term memory. Motivated by recent research on extending the capabilities of LLMs with tools, we seek to investigate the intriguing potential of tools to augment LLMs in handling such complexity by introducing a novel class of tools, termed middleware, to aid in the proactive exploration within these massive environments. Such specialized tools can serve as a middleware layer shielding the LLM from environmental complexity. In two representative complex environments -- knowledge bases (KBs) and databases -- we demonstrate the significant potential of augmenting language agents with tools in complex environments. Notably, equipped with the middleware, GPT-4 achieves 2.8X the performance of the best baseline in tasks requiring access to database content and 2.2X in KB tasks. Our findings illuminate the path for advancing language agents in real-world applications.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the limitations of large language models (LLMs) in complex environments such as knowledge bases and databases. These environments are too large to load into the model's short-term memory at once. The authors therefore propose a framework that equips LLMs with middleware tools so they can navigate and operate in these environments more effectively.

### Main Issues

1. **Handling Complex Environments**: Existing LLMs face challenges when dealing with complex environments, especially when accessing large-scale datasets or knowledge bases. Traditional linearization methods (i.e., converting environment descriptions into a series of discrete tokens) do not scale to large environments.
2. **Tool Enhancement**: How to augment LLMs with tools so that they navigate and operate in complex environments more effectively, thereby improving their performance.

### Solutions

1. **Middleware Tools**: The authors designed a set of tools specifically for complex environments, called middleware. These tools act as an intermediary layer between the LLM and the environment, helping the LLM actively explore and acquire the information it needs without directly handling every detail of the environment.
2. **Tool Usage Strategies**: To fully leverage the reasoning capabilities of LLMs, the authors proposed two new tool usage strategies:
   - **Error Feedback**: When the LLM makes a mistake in using a tool, it receives a specific error message that guides it to correct the error on its own.
   - **Decoupled Generation**: The LLM's reasoning steps are separated from tool usage to improve control and accuracy.

### Experimental Results

1. **Database Tasks**: On the BIRD dataset, GPT-4 equipped with middleware tools reached 2.8 times the performance of the best baseline (from 13.8% to 38.3%) on tasks requiring access to database content.
2. **Knowledge Base Tasks**: On the KBQA-AGENT dataset, GPT-4 equipped with middleware tools reached 2.2 times the performance of the best baseline (from 27.1% to 59.3%) on multi-hop reasoning tasks.

### Main Contributions

1. **New Framework**: Developed a new framework to study the role of LLMs in handling complex environments through customized tools.
2. **Comprehensive Evaluation**: Conducted detailed benchmarking of six different LLMs, validating the effectiveness of tool enhancement.
3. **Key Findings**: Demonstrated that tool enhancement significantly improves the performance of LLMs in complex environments, opening new possibilities for applying LLMs in real-world settings.

In summary, the paper improves the performance of LLMs in complex environments by introducing middleware tools and new tool usage strategies.
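The middleware idea and the error-feedback strategy described above can be illustrated with a small sketch. This is not the authors' implementation: the `Middleware` class, the `get_distinct_values` tool, the `schools` table, and the wording of the error messages are all hypothetical, chosen only to show how a tool can let an agent probe database content on demand and return a specific, actionable message when it is called incorrectly.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE schools (name TEXT, county TEXT)")
    conn.executemany(
        "INSERT INTO schools VALUES (?, ?)",
        [
            ("Lincoln High", "Alameda"),
            ("Mesa Verde", "Fresno"),
            ("Oak Grove", "Alameda"),
        ],
    )
    return conn


class Middleware:
    """Hypothetical tool layer between an LLM agent and a database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def _tables(self) -> list[str]:
        rows = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table'"
        ).fetchall()
        return [r[0] for r in rows]

    def _columns(self, table: str) -> list[str]:
        # Only called after `table` has been checked against _tables().
        rows = self.conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        return [r[1] for r in rows]  # index 1 holds the column name

    def get_distinct_values(self, table: str, column: str, limit: int = 5) -> str:
        """Return a few distinct values, or a specific error message.

        Errors are returned as text rather than raised, so the agent
        can read them as an observation and retry with corrected input.
        """
        if table not in self._tables():
            return (
                f"Error: table '{table}' does not exist. "
                f"Available tables: {self._tables()}"
            )
        columns = self._columns(table)
        if column not in columns:
            return (
                f"Error: column '{column}' not in '{table}'. "
                f"Available columns: {columns}"
            )
        rows = self.conn.execute(
            f'SELECT DISTINCT "{column}" FROM "{table}" ORDER BY 1 LIMIT ?',
            (limit,),
        ).fetchall()
        return str([r[0] for r in rows])


if __name__ == "__main__":
    mw = Middleware(build_demo_db())
    print(mw.get_distinct_values("schools", "county"))
    print(mw.get_distinct_values("school", "county"))  # misspelled table
```

In this sketch the agent never sees the whole table. It sees only the short observation the tool returns, which is the shielding role the summary attributes to middleware. The error strings list the valid alternatives so that a retry has something concrete to work from.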