MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

Yuyan Chen, Zhihao Wen, Ge Fan, Zhengyu Chen, Wei Wu, Dayiheng Liu, Zhixu Li, Bang Liu, Yanghua Xiao
2024-07-05
Abstract: Prompt engineering, as an efficient and effective way to leverage Large Language Models (LLMs), has drawn a lot of attention from the research community. Existing research primarily emphasizes the importance of adapting prompts to specific tasks, rather than to specific LLMs. However, a good prompt is not solely defined by its wording, but is also tied to the nature of the LLM in question. In this work, we first quantitatively demonstrate that different prompts should be adapted to different LLMs to enhance their capabilities across various downstream tasks in NLP. We then propose a model-adaptive prompt optimizer (MAPO) method that optimizes the original prompts for each specific LLM in downstream tasks. Extensive experiments indicate that the proposed method can effectively refine prompts for an LLM, leading to significant improvements over various downstream tasks.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the way the performance of large language models (LLMs) fluctuates across natural language processing tasks as prompt quality varies, and in particular the observation that a prompt suited to one LLM is not necessarily suited to another. Specifically, the paper makes the following contributions:
1. **Quantitative Demonstration of Adaptability**: The paper first shows through experiments that prompts should be adapted to each individual LLM, not only to each task, to improve performance on various downstream tasks.
2. **Proposing the MAPO Method**: It introduces a new method called Model-Adaptive Prompt Optimization (MAPO), which optimizes the original prompts for each specific LLM, thereby enhancing that LLM's performance on downstream tasks.
3. **Experimental Validation**: The paper reports extensive experiments in which MAPO significantly improves the robustness and generalization ability of LLMs on various downstream tasks, with results superior to existing methods.
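The core idea that the best prompt depends on the model can be illustrated with a small sketch. This is not the MAPO algorithm, whose details are not given in this summary; it only shows per-model selection among candidate prompts. The function `pick_prompt_per_model`, the `Scorer` type, and the two toy scoring functions are hypothetical stand-ins for evaluating a real LLM on a downstream task.

```python
from typing import Callable, Dict, List

# Hypothetical scorer: maps a prompt to a task score for one particular model.
Scorer = Callable[[str], float]


def pick_prompt_per_model(
    candidates: List[str], scorers: Dict[str, Scorer]
) -> Dict[str, str]:
    """For each model name, return the candidate prompt with the highest score."""
    return {name: max(candidates, key=score) for name, score in scorers.items()}


# Toy scorers (assumptions for illustration only, not real LLM evaluations).
def model_a_score(prompt: str) -> float:
    # Pretend model A does better with shorter prompts.
    return -float(len(prompt))


def model_b_score(prompt: str) -> float:
    # Pretend model B does better with more detailed prompts.
    return float(len(prompt.split()))


candidates = [
    "Summarize the text.",
    "Summarize the following text in one short sentence.",
]

best = pick_prompt_per_model(
    candidates, {"model_a": model_a_score, "model_b": model_b_score}
)
for model_name, prompt in best.items():
    print(f"{model_name}: {prompt}")
```

With these toy scorers the two models end up with different prompts for the same task, which is the behavior the paper's first contribution argues for. In practice each scorer would run the model on held-out task examples and return a task metric.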