The economic trade-offs of large language models: A case study

Kristen Howell, Gwen Christian, Pavel Fomitchov, Gitit Kehat, Julianne Marzulla, Leanne Rolston, Jadin Tredup, Ilana Zimmerman, Ethan Selfridge, Joseph Bradley
DOI: https://doi.org/10.48550/arXiv.2306.07402
2023-06-08
Computation and Language
Abstract: Contacting customer service via chat is a common practice. Because employing customer service agents is expensive, many companies are turning to NLP that assists human agents by auto-generating responses that can be used directly or with modifications. Large Language Models (LLMs) are a natural fit for this use case; however, their efficacy must be balanced with the cost of training and serving them. This paper assesses the practical cost and impact of LLMs for the enterprise as a function of the usefulness of the responses that they generate. We present a cost framework for evaluating an NLP model's utility for this use case and apply it to a single brand as a case study in the context of an existing agent assistance product. We compare three strategies for specializing an LLM - prompt engineering, fine-tuning, and knowledge distillation - using feedback from the brand's customer service agents. We find that the usability of a model's responses can make up for a large difference in inference cost for our case study brand, and we extrapolate our findings to the broader enterprise space.