Large Language Models Prompting With Episodic Memory

Dai Do, Quan Tran, Svetha Venkatesh, Hung Le
2024-08-14
Abstract: Prompt optimization is essential for enhancing the performance of Large Language Models (LLMs) in a range of Natural Language Processing (NLP) tasks, particularly in scenarios of few-shot learning where training examples are incorporated directly into the prompt. Despite the growing interest in optimizing prompts with few-shot examples, existing methods for prompt optimization are often resource-intensive or perform inadequately. In this work, we propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities. We approach prompt optimization as a Reinforcement Learning (RL) challenge, using episodic memory to archive combinations of input data, permutations of few-shot examples, and the rewards observed during training. In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory. Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks. Furthermore, our approach adapts well to broader language understanding tasks, consistently outperforming conventional heuristic methods for ordering examples.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper "Large Language Models Prompting With Episodic Memory" addresses how to optimize prompts for large language models (LLMs) in few-shot learning, and in particular how to order the examples placed in the prompt. It focuses on the following points:

1. **Importance of prompt optimization**:
   - Prompt optimization is crucial for enhancing the performance of LLMs across natural language processing (NLP) tasks, especially in the few-shot setting, where training examples are incorporated directly into the prompt.
2. **Limitations of existing methods**:
   - Existing prompt optimization methods are usually resource-intensive or perform inadequately. For example, early "soft prompt" methods need gradients from the LLM to construct prompts, which raises the computational cost and produces prompts that are hard to interpret and of uncertain quality.
   - Current state-of-the-art methods have turned to discrete prompt optimization. This improves interpretability, but these methods still lack principled optimization and are query-independent, so they succeed in some cases and fail in others.
3. **Proposed new method**:
   - The paper proposes PrOmpting with Episodic Memory (POEM), which optimizes the order of the examples in the prompt using an episodic memory. POEM treats prompt optimization as a reinforcement learning (RL) problem and uses the episodic memory to store input data, permutations of few-shot examples, and the rewards observed during training.
   - In the testing phase, POEM optimizes the example order for each test query by selecting the sequence that obtains the highest total reward among the top-k most similar training examples in the episodic memory.

### Main contributions

1. **Efficient and general-purpose optimization method**:
   - POEM is simple and efficient, shows strong generalization ability, and applies to text classification as well as broader language understanding tasks.
2. **Significant performance improvement**:
   - Experiments show that POEM outperforms recent techniques such as TEMPERA and RLPrompt by more than 5.3% on multiple text classification tasks.
   - On more complex tasks such as common-sense reasoning and question answering, POEM also performs well and consistently outperforms conventional heuristic methods for ordering examples.
3. **Robustness and stability**:
   - POEM's performance is more stable across datasets, with smaller variance, indicating that it is more reliable across tasks.

### Conclusion

With POEM, the paper offers a simple and efficient approach to prompt optimization in few-shot learning and a new way to improve the performance of LLMs on multiple NLP tasks.
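The test-time selection step described under "Proposed new method" can be illustrated with a small sketch. This is not the authors' implementation: the `EpisodicMemory` class, its `write` and `best_ordering` methods, the use of cosine similarity over plain-list embeddings, and the choice to keep the maximum reward seen for each (query, ordering) pair are all assumptions made for illustration. Orderings are represented as tuples of example indices.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class EpisodicMemory:
    """Hypothetical memory: training-query embedding -> {ordering: reward}."""

    def __init__(self):
        self.keys = []    # embeddings of training queries
        self.values = []  # one {ordering: reward} table per training query

    def write(self, embedding, ordering, reward):
        """Record the reward observed for one ordering on one training query."""
        key = tuple(embedding)
        if key in self.keys:
            idx = self.keys.index(key)
        else:
            self.keys.append(key)
            self.values.append({})
            idx = len(self.keys) - 1
        table = self.values[idx]
        ordering = tuple(ordering)
        # Assumption: keep the best reward seen so far for this pair.
        table[ordering] = max(table.get(ordering, float("-inf")), float(reward))

    def best_ordering(self, query_embedding, k=3):
        """Return the ordering with the highest total reward over the
        k stored training queries most similar to the test query."""
        if not self.keys:
            raise ValueError("memory is empty")
        ranked = sorted(
            range(len(self.keys)),
            key=lambda i: cosine(query_embedding, self.keys[i]),
            reverse=True,
        )
        totals = defaultdict(float)
        for i in ranked[:k]:
            for ordering, reward in self.values[i].items():
                totals[ordering] += reward
        return max(totals, key=totals.get)


# Toy usage with made-up 2-d embeddings and 0/1 rewards.
memory = EpisodicMemory()
memory.write([1.0, 0.0], (0, 1, 2), 1.0)
memory.write([1.0, 0.0], (2, 1, 0), 0.0)
memory.write([0.9, 0.1], (0, 1, 2), 0.0)
memory.write([0.9, 0.1], (2, 1, 0), 1.0)
memory.write([0.8, 0.2], (2, 1, 0), 1.0)
memory.write([0.0, 1.0], (1, 0, 2), 1.0)

print(memory.best_ordering([1.0, 0.05], k=3))  # (2, 1, 0)
print(memory.best_ordering([1.0, 0.05], k=1))  # (0, 1, 2)
```

In the toy data, the nearest single neighbor favors ordering `(0, 1, 2)`, but summing rewards over the three nearest neighbors favors `(2, 1, 0)`, which shows how aggregating over top-k neighbors can change the selected ordering.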