Abstract: Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. To achieve this, routing is a promising method, which learns a router to select the most suitable LLM for each query. However, existing routing models are ineffective when multiple LLMs perform well for a query. To address this problem, in this paper, we propose a method called query-based Router by Dual Contrastive learning (RouterDC). The RouterDC model consists of an encoder and LLM embeddings, and we propose two contrastive learning losses to train the RouterDC model. Experimental results show that RouterDC is effective in assembling LLMs and largely outperforms individual top-performing LLMs as well as existing routing methods on both in-distribution (+2.76%) and out-of-distribution (+1.90%) tasks. Source code is available at https://github.com/shuhao02/RouterDC.

What problem does this paper attempt to address?

This paper addresses how to effectively assemble multiple off-the-shelf large language models (LLMs) so as to exploit their complementary capabilities. Existing routing methods struggle to select the best LLM for each query, especially when several LLMs perform well on the same query and the router cannot distinguish among them. To address this, the paper proposes a query-based router trained with dual contrastive learning (RouterDC), which aims to improve the accuracy and efficiency of LLM selection.

### Problem Background

LLMs perform well across a wide range of tasks, but each model has its own strengths and weaknesses on different tasks. Assembling multiple LLMs can therefore make better use of their complementary capabilities. However, existing routing methods are not effective when multiple LLMs perform well on the same query. For example, the ZOOTER method uses reward-model scores as the supervision signal; when several LLMs perform similarly, the router ends up assigning only a small probability to each of them, which hurts selection accuracy.

### Solution

RouterDC improves the training of the routing model by introducing two contrastive losses:

1. **Sample-LLM Contrastive Loss**: For each query, the LLMs are divided by their performance on that query into positive LLMs (good performance) and negative LLMs (poor performance). The contrastive loss pulls the query embedding closer to the embeddings of the positive LLMs and pushes it away from the embeddings of the negative LLMs.

   \[ L_{\text{sample-LLM}}(x_i, y_i; \theta) = -\sum_{t^+ \in I^+_i} \log \frac{\exp(\text{sim}(E(x_i; w), k_{t^+}))}{\exp(\text{sim}(E(x_i; w), k_{t^+})) + \sum_{t^- \in I^-_i} \exp(\text{sim}(E(x_i; w), k_{t^-}))} \]

   Here \(E(x_i; w)\) is the encoder's embedding of query \(x_i\), \(k_t\) is the embedding of LLM \(t\), and \(I^+_i\) and \(I^-_i\) are the index sets of positive and negative LLMs for that query.

2. **Sample-Sample Contrastive Loss**: To improve training stability, the paper also introduces a sample-sample contrastive loss. The training queries are clustered into multiple groups; queries within the same group are encouraged to have similar embeddings, while queries from different groups are pushed apart.

   \[ L_{\text{sample-sample}}(x_i; \theta) = -\log \frac{\exp(\text{sim}(E(x_i; w), E(x^+_i; w)))}{\exp(\text{sim}(E(x_i; w), E(x^+_i; w))) + \sum_{x^-_i \in X^-_i} \exp(\text{sim}(E(x_i; w), E(x^-_i; w)))} \]

   Here \(x^+_i\) is a query from the same group as \(x_i\), and \(X^-_i\) is a set of queries from other groups.

### Experimental Results

The experimental results show that RouterDC outperforms both existing routing methods and individual top-performing LLMs on in-distribution and out-of-distribution tasks. Specifically:

- On in-distribution tasks, RouterDC improves average accuracy by 2.76%.
- On out-of-distribution tasks, RouterDC improves average accuracy by 1.90%.
- RouterDC's inference is about 6 times faster than that of the voting method.

In summary, by introducing dual contrastive learning, the paper addresses the selection problem that existing routing models face when multiple LLMs perform similarly, and achieves improvements in both performance and efficiency.
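The two losses in the Solution section can be sketched in plain Python to make the formulas concrete. This is a minimal illustration, not the authors' implementation: it assumes `sim` is cosine similarity (the summary does not specify it), takes embeddings as plain lists of floats instead of encoder outputs, and all function names are hypothetical. The `route` helper, which picks the LLM whose embedding is most similar to the query, is likewise an assumption about how a trained router might be used at inference time.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_sim(a: Vector, b: Vector) -> float:
    """Cosine similarity; stands in for sim(., .) (an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sample_llm_loss(query_emb: Vector, llm_embs: List[Vector],
                    pos_idx: Sequence[int], neg_idx: Sequence[int]) -> float:
    """Sample-LLM loss for one query: one log-ratio term per positive LLM."""
    # The sum over negative LLMs is shared by every positive term.
    neg_sum = sum(math.exp(cosine_sim(query_emb, llm_embs[t])) for t in neg_idx)
    loss = 0.0
    for t in pos_idx:
        pos = math.exp(cosine_sim(query_emb, llm_embs[t]))
        loss -= math.log(pos / (pos + neg_sum))
    return loss


def sample_sample_loss(query_emb: Vector, pos_query_emb: Vector,
                       neg_query_embs: List[Vector]) -> float:
    """Sample-sample loss: one same-group query against out-of-group queries."""
    pos = math.exp(cosine_sim(query_emb, pos_query_emb))
    neg_sum = sum(math.exp(cosine_sim(query_emb, n)) for n in neg_query_embs)
    return -math.log(pos / (pos + neg_sum))


def route(query_emb: Vector, llm_embs: List[Vector]) -> int:
    """Return the index of the LLM embedding most similar to the query."""
    sims = [cosine_sim(query_emb, k) for k in llm_embs]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy example: three LLM embeddings in 2-D; LLM 0 is the only positive.
llm_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
near_query = [0.9, 0.1]    # points toward the positive LLM
far_query = [-0.9, 0.1]    # points toward a negative LLM

loss_near = sample_llm_loss(near_query, llm_embs, pos_idx=[0], neg_idx=[1, 2])
loss_far = sample_llm_loss(far_query, llm_embs, pos_idx=[0], neg_idx=[1, 2])
group_loss = sample_sample_loss(near_query, [1.0, 0.2], [[-1.0, 0.0], [0.0, -1.0]])
chosen = route(near_query, llm_embs)
```

In the toy example the query that points toward the positive LLM incurs a smaller sample-LLM loss than the one pointing toward a negative LLM, which is the behavior the loss is meant to reward. In actual training the embeddings would come from a learnable encoder and learnable LLM embeddings, and the losses would be minimized by gradient descent.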

RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models

RouteLLM: Learning to Route LLMs with Preference Data

RouterBench: A Benchmark for Multi-LLM Routing System

GraphRouter: A Graph-based Router for LLM Selections

A Unified Approach to Routing and Cascading for LLMs

TensorOpera Router: A Multi-Model Router for Efficient LLM Inference

PolyRouter: A Multi-LLM Querying System

RouterRetriever: Exploring the Benefits of Routing over Multiple Expert Embedding Models

Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models

Eagle: Efficient Training-Free Router for Multi-LLM Inference

Performance Characterization of Expert Router for Scalable LLM Inference

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

EmbedLLM: Learning Compact Representations of Large Language Models

Layerwise Recurrent Router for Mixture-of-Experts

Can Large Language Models Solve Robot Routing?

Exploring Domain Robust Lightweight Reward Models based on Router Mechanism

Routing Experts: Learning to Route Dynamic Experts in Multi-modal Large Language Models

Smoothie: Label Free Language Model Routing

One Person, One Model--Learning Compound Router for Sequential Recommendation

Real-time Adapting Routing (RAR): Improving Efficiency Through Continuous Learning in Software Powered by Layered Foundation Models

HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts