Introducing MAPO: Momentum-Aided Gradient Descent Prompt Optimization

Anthony Cui, Pranav Nandyalam, Ethan Cheung, Kevin Zhu
2024-11-02
Abstract: Momentum-Aided Prompt Optimization (MAPO) enhances the efficiency and efficacy of prompt optimization for Large Language Models (LLMs). Building on ProTeGi, MAPO uses positive natural language "gradients" and a momentum-based extension to refine prompts effectively. By tracking gradient history, MAPO avoids local minima and oscillations. It also utilizes beam search and an Upper Confidence Bound (UCB) algorithm for balanced candidate expansion and selection. Benchmark testing shows that MAPO achieves faster convergence time with fewer API calls and higher F1 scores than ProTeGi, proving it as a robust and scalable solution for automated prompt engineering in LLMs.
Computation and Language
What problem does this paper attempt to address?
This paper addresses prompt optimization for large language models (LLMs). Specifically, it aims to make automatic prompt engineering more efficient and effective by overcoming two limitations of existing approaches: manual prompt tuning is time-consuming and error-prone, and automated systems can get stuck in local minima or oscillate during optimization.

### Main contributions of the paper:

1. **Proposed a new optimization method**: Momentum-Aided Prompt Optimization (MAPO). The method extends natural-language gradient descent with a momentum mechanism so that prompts are optimized more effectively.
2. **Avoid local minima and oscillation**: By tracking the gradient history, MAPO avoids local minima and oscillation, which makes the optimization process more stable and efficient.
3. **Reduce resource consumption**: Compared with existing methods such as ProTeGi, MAPO needs fewer API calls and less running time while matching or exceeding their F1 score.
4. **Experimental verification**: Results on the Liar and Ethos datasets show that MAPO converges much faster than ProTeGi and also improves the F1 score.

### Specific technical details:

- **Positive natural-language gradient**: MAPO uses correctly handled examples to generate positive natural-language gradients, which guide the LLM to make consistent directional adjustments in the semantic space.
- **Momentum mechanism**: By recording the gradient history, MAPO smooths the optimization path and avoids getting trapped in local minima or oscillating.
- **Beam search and UCB algorithm**: MAPO combines beam search with the Upper Confidence Bound (UCB) algorithm to balance the expansion and selection of candidate prompts, keeping the optimization efficient and accurate.

### Experimental results:

- **Efficiency improvement**: MAPO is approximately 2.5 times faster than ProTeGi on the Liar dataset and approximately 5 times faster on the Ethos dataset.
- **Performance improvement**: MAPO's F1 score is approximately 5.37% higher than ProTeGi's, and its performance is more stable during optimization.

In summary, the paper presents a more efficient and stable method for automatic prompt optimization, built on a momentum mechanism and positive natural-language gradients.
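The two mechanisms described above, a history of textual gradients and UCB-based selection within a beam, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `GradientHistory`, `ucb_scores`, and `select_beam`, the window size `max_len`, the exploration constant `c`, and the `(pulls, total_reward)` bookkeeping are all assumptions made for illustration, and the summary does not specify how MAPO actually combines past gradients or scores candidates.

```python
import math
from collections import deque


class GradientHistory:
    """Hypothetical helper: remembers the most recent textual 'gradients'."""

    def __init__(self, max_len=3):
        # A bounded deque drops the oldest gradient once max_len is exceeded.
        self.items = deque(maxlen=max_len)

    def add(self, gradient_text):
        self.items.append(gradient_text)

    def as_context(self):
        # Text that could accompany the next gradient request so that new
        # feedback stays consistent with earlier feedback (assumed usage).
        return "\n".join(f"- {g}" for g in self.items)


def ucb_scores(stats, c=2.0):
    """UCB-style score per candidate prompt.

    stats maps prompt -> (pulls, total_reward). Prompts that have never been
    evaluated score +inf so that each one is tried at least once.
    """
    total_pulls = sum(n for n, _ in stats.values())
    scores = {}
    for prompt, (n, reward) in stats.items():
        if n == 0:
            scores[prompt] = math.inf
        else:
            mean = reward / n                                 # exploitation term
            bonus = c * math.sqrt(math.log(total_pulls) / n)  # exploration term
            scores[prompt] = mean + bonus
    return scores


def select_beam(stats, beam_width, c=2.0):
    """Keep the beam_width prompts with the highest UCB-style score."""
    scores = ucb_scores(stats, c)
    return sorted(scores, key=scores.get, reverse=True)[:beam_width]
```

In this sketch, a candidate that has been evaluated only a few times receives a larger exploration bonus, so the beam does not lock onto an early leader too quickly. The bounded history plays the role the summary attributes to momentum: recent feedback keeps influencing the next edit instead of being discarded after each step.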