Efficient Prompting Methods for Large Language Models: A Survey

Kaiyan Chang, Songcheng Xu, Chenglong Wang, Yingfeng Luo, Tong Xiao, Jingbo Zhu
2024-04-01
Abstract: Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks. While this approach opens the door to in-context learning of LLMs, it brings the additional computational burden of model inference and the human effort of manually designing prompts, particularly when using lengthy and complex prompts to guide and control the behavior of LLMs. As a result, the LLM field has seen a remarkable surge in efficient prompting methods. In this paper, we present a comprehensive overview of these methods. At a high level, efficient prompting methods can broadly be categorized into two approaches: prompting with efficient computation and prompting with efficient design. The former involves various ways of compressing prompts, and the latter employs techniques for automatic prompt optimization. We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the efficiency problems that arise when prompts are used to adapt large language models (LLMs) to specific natural language processing tasks. It focuses on two main problems:

1. **Computational Burden**: Long and complex prompts increase the computational cost of model inference. The cost is felt most when the model has to process a large amount of contextual information.
2. **Manual Design Cost**: Designing high-quality prompts by hand takes significant time and effort, especially for complex and detailed prompts.

To address these problems, the paper reviews existing efficient prompting methods and groups them into two main types:

- **Prompting with Efficient Computation**: These methods reduce the consumption of computational resources by compressing prompts. Techniques include knowledge distillation, encoding, and filtering.
- **Prompting with Efficient Design**: These methods improve efficiency by optimizing prompt design automatically. Techniques include gradient-based methods and intelligent algorithm-based methods.

With this review, the paper aims to give researchers and developers effective strategies for saving financial and human resources, and thereby to promote the wider use of large language models in academic research and commercial applications.
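
To make the filtering idea concrete, here is a minimal sketch of prompt compression by filtering. It is not the method of any particular paper covered by the survey: the function names (`split_sentences`, `sentence_scores`, `compress_prompt`) and the `keep_ratio` parameter are hypothetical, and the score is a crude stand-in that estimates unigram self-information from the prompt itself. Published filtering methods typically score text with a language model instead.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter on sentence-ending punctuation; enough for a sketch.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_scores(sentences):
    # ASSUMPTION: unigram self-information, -log p(word), with p estimated
    # from the prompt itself. A real system would use a language model here.
    words = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    counts = Counter(w for ws in words for w in ws)
    total = sum(counts.values())
    scores = []
    for ws in words:
        if not ws:
            scores.append(0.0)
            continue
        info = sum(-math.log(counts[w] / total) for w in ws)
        scores.append(info / len(ws))  # average information per word
    return scores


def compress_prompt(prompt, keep_ratio=0.5):
    # Keep the highest-scoring sentences, preserving their original order.
    sentences = split_sentences(prompt)
    if not sentences:
        return prompt
    scores = sentence_scores(sentences)
    n_keep = max(1, math.ceil(len(sentences) * keep_ratio))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:n_keep])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    prompt = (
        "You are a helpful assistant. "
        "You are a helpful assistant to the user. "
        "Translate the invoice dated 2024 into French. "
        "Be polite."
    )
    print(compress_prompt(prompt, keep_ratio=0.5))
```

The shortened prompt costs fewer tokens at inference time, which is the saving that compression methods target. The trade-off is that a filter this simple can drop instructions the model needs, so any real use would have to check task accuracy after compression.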