AdaSwitch: Adaptive Switching between Small and Large Agents for Effective Cloud-Local Collaborative Learning

Hao Sun, Jiayi Wu, Hengyi Cai, Xiaochi Wei, Yue Feng, Bo Wang, Shuaiqiang Wang, Yan Zhang, Dawei Yin
2024-10-17
Abstract: Recent advancements in large language models (LLMs) have been remarkable. Users face a choice between using cloud-based LLMs for generation quality and deploying local-based LLMs for lower computational cost. The former option is typically costly and inefficient, while the latter usually fails to deliver satisfactory performance for reasoning steps requiring deliberate thought processes. In this work, we propose a novel LLM utilization paradigm that facilitates the collaborative operation of large cloud-based LLMs and smaller local-deployed LLMs. Our framework comprises two primary modules: the local agent instantiated with a relatively smaller LLM, handling less complex reasoning steps, and the cloud agent equipped with a larger LLM, managing more intricate reasoning steps. This collaborative processing is enabled through an adaptive mechanism where the local agent introspectively identifies errors and proactively seeks assistance from the cloud agent, thereby effectively integrating the strengths of both locally-deployed and cloud-based LLMs, resulting in significant enhancements in task completion performance and efficiency. We evaluate AdaSwitch across 7 benchmarks, ranging from mathematical reasoning to complex question answering, using various types of LLMs to instantiate the local and cloud agents. The empirical results show that AdaSwitch effectively improves the performance of the local agent, and sometimes achieves competitive results compared to the cloud agent while utilizing much less computational overhead.
Computation and Language
What problem does this paper attempt to address?
This paper addresses a trade-off in applying large language models (LLMs): how to balance the high-quality generation of large cloud-deployed LLMs against the low computational cost of small locally deployed LLMs. Large cloud LLMs produce high-quality results but are costly and inefficient. Small local LLMs are cheaper to run but usually fall short on reasoning steps that require deliberate thought. To resolve this, the paper proposes a new LLM utilization paradigm, AdaSwitch, in which a large cloud LLM and a small local LLM operate collaboratively. The AdaSwitch framework contains two main modules: a local agent, which uses a relatively small LLM to handle simpler reasoning steps, and a cloud agent, which uses a larger LLM to handle more complex reasoning steps. Through an adaptive mechanism, the local agent detects its own errors and actively seeks help from the cloud agent. This combines the strengths of locally deployed and cloud-based LLMs and, according to the paper, significantly improves both task performance and efficiency.
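The switching idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `solve`, `Trace`, `local_step`, `local_flags_error`, `cloud_step`, and the `ANSWER:` marker are hypothetical, the agents are plain callables standing in for LLM calls, and handing control back to the local agent after each cloud step is an assumption that the abstract does not confirm.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interfaces (not from the paper):
# an agent maps (question, previous steps) to the next reasoning step.
StepFn = Callable[[str, List[str]], str]
# a self-check returns True when the local agent judges its own step to be wrong.
CheckFn = Callable[[str, List[str], str], bool]

FINAL_PREFIX = "ANSWER:"  # assumed marker for the final step


@dataclass
class Trace:
    steps: List[str] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)  # "local" or "cloud" per step


def solve(question: str, local_step: StepFn, local_flags_error: CheckFn,
          cloud_step: StepFn, max_steps: int = 8) -> Trace:
    """Run step-by-step reasoning, escalating to the cloud agent only when
    the local agent flags its own candidate step as erroneous."""
    trace = Trace()
    for _ in range(max_steps):
        candidate = local_step(question, trace.steps)
        source = "local"
        if local_flags_error(question, trace.steps, candidate):
            # Discard the flagged step and ask the larger model instead.
            candidate = cloud_step(question, trace.steps)
            source = "cloud"
        trace.steps.append(candidate)
        trace.sources.append(source)
        if candidate.startswith(FINAL_PREFIX):
            break
    return trace


# Toy stand-ins for the two LLMs, for illustration only.
def toy_local(question: str, steps: List[str]) -> str:
    if len(steps) == 0:
        return "split 17 * 24 into 17 * 20 + 17 * 4"
    if len(steps) == 1:
        return "17 * 20 + 17 * 4 = 398"  # deliberately wrong arithmetic
    # Final step: report the value computed in the previous step.
    return FINAL_PREFIX + " " + steps[-1].split("=")[-1].strip()


def toy_self_check(question: str, steps: List[str], candidate: str) -> bool:
    # Toy rule: pretend the local agent recognizes this specific mistake.
    return candidate.endswith("398")


def toy_cloud(question: str, steps: List[str]) -> str:
    return "17 * 20 + 17 * 4 = 408"


if __name__ == "__main__":
    trace = solve("What is 17 * 24?", toy_local, toy_self_check, toy_cloud)
    for source, step in zip(trace.sources, trace.steps):
        print(f"[{source}] {step}")
```

In this sketch the cloud agent is called once, for the single step the local agent flags, which is where the cost saving would come from under these assumptions. The real system's error detection and hand-off policy may differ.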