OptLLM: Optimal Assignment of Queries to Large Language Models

Yueyue Liu, Hongyu Zhang, Yuantian Miao, Van-Hoang Le, Zhiqiang Li

2024-05-24

Abstract: Large Language Models (LLMs) have garnered considerable attention owing to their remarkable capabilities, leading to an increasing number of companies offering LLMs as services. Different LLMs achieve different performance at different costs. A challenge for users lies in choosing the LLMs that best fit their needs, balancing cost and performance. In this paper, we propose a framework for addressing the cost-effective query allocation problem for LLMs. Given a set of input queries and candidate LLMs, our framework, named OptLLM, provides users with a range of optimal solutions to choose from, aligning with their budget constraints and performance preferences, including options for maximizing accuracy and minimizing cost. OptLLM predicts the performance of candidate LLMs on each query using a multi-label classification model with uncertainty estimation and then iteratively generates a set of non-dominated solutions by destructing and reconstructing the current solution. To evaluate the effectiveness of OptLLM, we conduct extensive experiments on various types of tasks, including text classification, question answering, sentiment analysis, reasoning, and log parsing. Our experimental results demonstrate that OptLLM substantially reduces costs by 2.40% to 49.18% while achieving the same accuracy as the best LLM. Compared to other multi-objective optimization algorithms, OptLLM improves accuracy by 2.94% to 69.05% at the same cost or saves costs by 8.79% to 95.87% while maintaining the highest attainable accuracy.
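The two objectives in the abstract, total cost and accuracy, can be made concrete with a small sketch. It assumes hypothetical inputs that are not taken from the paper: `p[i][j]` is a predicted probability that LLM `j` answers query `i` correctly, and `c[i][j]` is the cost of sending query `i` to LLM `j`. The numbers are invented for illustration, and the code only scores assignments; it is not OptLLM's optimization procedure.

```python
from typing import List, Tuple

def evaluate(assign: List[int], p: List[List[float]], c: List[List[int]]) -> Tuple[int, float]:
    """Return (total cost, expected accuracy) of one query-to-LLM assignment."""
    cost = sum(c[i][j] for i, j in enumerate(assign))
    acc = sum(p[i][j] for i, j in enumerate(assign)) / len(assign)
    return cost, acc

def cheapest(c: List[List[int]]) -> List[int]:
    """Extreme solution 1: send every query to its lowest-cost LLM."""
    return [min(range(len(row)), key=row.__getitem__) for row in c]

def most_accurate(p: List[List[float]]) -> List[int]:
    """Extreme solution 2: send every query to the LLM with the highest predicted success."""
    return [max(range(len(row)), key=row.__getitem__) for row in p]

# Illustrative data: 3 queries, 2 LLMs (LLM 0 is cheap, LLM 1 is expensive).
p = [[0.90, 0.95], [0.20, 0.90], [0.60, 0.60]]
c = [[1, 10], [1, 10], [1, 10]]

low = cheapest(c)           # all queries go to the cheap model
high = most_accurate(p)     # ties fall back to the first (cheaper) model
mixed = [0, 1, 0]           # only the hard query goes to the expensive model

low_cost, low_acc = evaluate(low, p, c)
high_cost, high_acc = evaluate(high, p, c)
mixed_cost, mixed_acc = evaluate(mixed, p, c)
```

In this toy instance the mixed assignment costs 12 instead of 21 while giving up little expected accuracy, which is the kind of trade-off the framework is meant to expose.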

Software Engineering, Computation and Language, Machine Learning

What problem does this paper attempt to address?

The paper addresses how to allocate queries across multiple large language models (LLMs) so as to balance cost and performance. As more providers offer LLMs as services, users must choose among models that differ in both accuracy and price. The proposed framework, OptLLM, works in two stages: it first predicts how each candidate LLM will perform on each query, using a multi-label classification model with uncertainty estimation, and then iteratively destructs and reconstructs the current solution to generate a set of non-dominated solutions. Users pick from this set according to their budget and performance preferences. In the reported experiments, OptLLM matches the accuracy of the best single LLM while reducing cost by 2.40% to 49.18%; compared with other multi-objective optimization algorithms, it improves accuracy by 2.94% to 69.05% at the same cost.
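To illustrate what "non-dominated" means here, the following minimal sketch filters a list of candidate solutions, each summarized as a hypothetical `(total_cost, accuracy)` pair, down to its Pareto front. The candidate values are made up, and the brute-force filter is a generic technique, not the search procedure used in the paper.

```python
from typing import List, Tuple

Solution = Tuple[float, float]  # (total_cost, accuracy); lower cost and higher accuracy are better

def dominates(a: Solution, b: Solution) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on at least one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

def non_dominated(solutions: List[Solution]) -> List[Solution]:
    """Keep only solutions that no other solution dominates, sorted by cost."""
    return sorted(s for s in solutions if not any(dominates(o, s) for o in solutions))

# Illustrative candidates (assumed values).
candidates = [(1.0, 0.70), (2.0, 0.80), (2.5, 0.78), (4.0, 0.90), (4.0, 0.85)]
front = non_dominated(candidates)
print(front)
```

Each point left on the front represents a distinct budget and accuracy trade-off; a user with a cost cap would pick the most accurate solution whose cost fits under it.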

SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models

ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency

MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs

Cost-Effective Online Multi-LLM Selection with Versatile Reward Models

Solving General Natural-Language-Description Optimization Problems with Large Language Models

LLM Cascade with Multi-Objective Optimal Consideration

SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization

Towards Optimizing with Large Language Models

Exploring Accuracy-Fairness Trade-off in Large Language Models

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments

Large Language Models for Supply Chain Optimization

Optimizing Numerical Estimation and Operational Efficiency in the Legal Domain through Large Language Models

LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch

OptiMUS-0.3: Using Large Language Models to Model and Solve Optimization Problems at Scale

Using Large Language Models for Hyperparameter Optimization

When Large Language Model Meets Optimization

LLMs are Highly-Constrained Biophysical Sequence Optimizers

OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models

All Language Models Large and Small