Abstract: Prompt optimization is essential for enhancing the performance of Large Language Models (LLMs) in a range of Natural Language Processing (NLP) tasks, particularly in scenarios of few-shot learning where training examples are incorporated directly into the prompt. Despite the growing interest in optimizing prompts with few-shot examples, existing methods for prompt optimization are often resource-intensive or perform inadequately. In this work, we propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities. We approach prompt optimization as a Reinforcement Learning (RL) challenge, using episodic memory to archive combinations of input data, permutations of few-shot examples, and the rewards observed during training. In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory. Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks. Furthermore, our approach adapts well to broader language understanding tasks, consistently outperforming conventional heuristic methods for ordering examples.
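
The test-time selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `EpisodicMemory` class, the `write` and `best_permutation` methods, the use of cosine similarity over precomputed embeddings, and the rule of keeping the maximum reward per stored permutation are all assumptions made for illustration; the abstract only states that the memory stores inputs, example permutations, and rewards, and that the ordering with the highest total reward among the top-k most similar training examples is selected.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Permutation = Tuple[int, ...]  # an ordering of few-shot example indices


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EpisodicMemory:
    """Hypothetical memory: one slot per training input embedding."""

    def __init__(self) -> None:
        self.keys: List[List[float]] = []
        self.values: List[Dict[Permutation, float]] = []

    def write(self, embedding: Sequence[float],
              permutation: Sequence[int], reward: float) -> None:
        """Record the reward observed for a permutation on a training input."""
        key, perm = list(embedding), tuple(permutation)
        for stored_key, rewards in zip(self.keys, self.values):
            if stored_key == key:
                # Assumption: keep the best reward seen for this permutation.
                rewards[perm] = max(reward, rewards.get(perm, reward))
                return
        self.keys.append(key)
        self.values.append({perm: reward})

    def best_permutation(self, query: Sequence[float], k: int) -> Permutation:
        """Sum rewards per permutation over the k most similar training
        inputs and return the permutation with the highest total."""
        if not self.keys:
            raise ValueError("episodic memory is empty")
        ranked = sorted(range(len(self.keys)),
                        key=lambda i: cosine(query, self.keys[i]),
                        reverse=True)
        totals: Dict[Permutation, float] = defaultdict(float)
        for i in ranked[:k]:
            for perm, reward in self.values[i].items():
                totals[perm] += reward
        return max(totals, key=totals.get)


# Toy usage with made-up 2-d embeddings and rewards.
memory = EpisodicMemory()
memory.write([1.0, 0.0], (0, 1, 2), 1.0)
memory.write([1.0, 0.0], (2, 1, 0), 0.0)
memory.write([0.9, 0.1], (0, 1, 2), 1.0)
memory.write([0.0, 1.0], (2, 1, 0), 1.0)
print(memory.best_permutation([1.0, 0.05], k=2))
```

In this toy setup, the query is closest to the first two stored inputs, both of which rewarded the ordering `(0, 1, 2)`, so that ordering is selected. A real system would obtain embeddings from a text encoder and rewards from the LLM's task performance during training, neither of which is modeled here.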

What problem does this paper attempt to address?

### Problems the paper attempts to solve

The paper "Prompting With Episodic Memory" addresses how to optimize the prompt content of large language models (LLMs) in few-shot learning, and in particular the order of the examples in the prompt. Specifically, the paper focuses on the following points:

1. **Importance of prompt optimization**:
   - Prompt optimization is crucial for enhancing the performance of LLMs in various natural language processing (NLP) tasks, especially in few-shot learning, where training examples are incorporated directly into the prompt.
2. **Limitations of existing methods**:
   - Existing prompt optimization methods are usually resource-intensive or perform inadequately. For example, early methods such as "soft prompts" need gradients from the LLM to construct prompts, which increases the computational cost and raises challenges in interpretability and quality.
   - Current state-of-the-art methods turn to discrete prompt optimization. Although they improve interpretability, they still lack principled optimization and are query-independent, so they succeed in some cases and fail in others.
3. **Proposed new method**:
   - The paper proposes PrOmpting with Episodic Memory (POEM), which optimizes the order of examples in the prompt through episodic memory. POEM treats prompt optimization as a reinforcement learning (RL) problem and uses episodic memory to store input data, permutations of few-shot examples, and the rewards observed during training.
   - In the testing phase, POEM optimizes the example order for each test query by selecting the sequence that obtains the highest total reward among the top-k most similar training examples in the episodic memory.

### Main contributions

1. **Efficient and general-purpose optimization method**:
   - POEM is simple and efficient, shows strong generalization ability, and is suitable for various text classification tasks and other language understanding tasks.
2. **Significant performance improvement**:
   - The experimental results show that POEM outperforms recent techniques such as TEMPERA and RLPrompt by more than 5.3% in multiple text classification tasks.
   - In more complex tasks such as commonsense reasoning and question answering, POEM also performs well and consistently outperforms traditional heuristic methods for ordering examples.
3. **Robustness and stability**:
   - POEM's performance across datasets is more stable, with smaller variance, indicating that it is more reliable across tasks.

### Conclusion

By introducing POEM, the paper addresses the problem of prompt optimization in few-shot learning and provides a new approach to improving the performance of LLMs in multiple NLP tasks.

Large Language Models Prompting With Episodic Memory
