Abstract: Language Model Programs, i.e. sophisticated pipelines of modular language model (LM) calls, are increasingly advancing NLP tasks, but they require crafting prompts that are jointly effective for all modules. We study prompt optimization for LM programs, i.e. how to update these prompts to maximize a downstream metric without access to module-level labels or gradients. To make this tractable, we factorize our problem into optimizing the free-form instructions and few-shot demonstrations of every module and introduce several strategies to craft task-grounded instructions and navigate credit assignment across modules. Our strategies include (i) program- and data-aware techniques for proposing effective instructions, (ii) a stochastic mini-batch evaluation function for learning a surrogate model of our objective, and (iii) a meta-optimization procedure in which we refine how LMs construct proposals over time. Using these insights we develop MIPRO, a novel algorithm for optimizing LM programs. MIPRO outperforms baseline optimizers on five of seven diverse multi-stage LM programs using a best-in-class open-source model (Llama-3-8B), by as high as 13% accuracy. We have released our new optimizers and benchmark in DSPy at http://dspy.ai
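
The objective described in the abstract can be written compactly. The following is a sketch under assumed notation, not quoted from the abstract: $\Phi$ is the LM program, $\mathbf{V}$ the prompt variables (instructions and demonstrations) across all modules, $\mathbf{S}$ the strings assigned to them, $\mathcal{D}$ a training set of inputs $x$ with optional metadata $x'$ such as final answers, and $\mu$ the downstream metric.

```latex
\Phi^{*} \;=\; \operatorname*{arg\,max}_{\mathbf{V} \to \mathbf{S}}
\; \frac{1}{|\mathcal{D}|} \sum_{(x,\, x') \in \mathcal{D}}
\mu\big(\Phi_{\mathbf{V} \to \mathbf{S}}(x),\; x'\big)
```

Only the program's final output is scored by $\mu$, which is why no module-level labels or gradients are needed and why credit has to be assigned across modules indirectly.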

What problem does this paper attempt to address?

The problem this paper addresses is: **how to optimize the prompts in multi-stage language model programs to maximize downstream task performance without access to module-level labels or gradients**. The paper focuses on language model (LM) programs composed of multiple modules, each of which needs a carefully designed prompt. These prompts are currently written mostly by manual trial and error, which is time-consuming and gives no assurance that the prompts work well together. Most existing prompt optimization methods also do not apply to multi-stage LM programs, because no supervision is available for the intermediate steps.

To address this, the paper does the following:

1. **Define the problem**: The paper formally defines the problem of optimizing prompts in language model programs and identifies two key challenges:
   - **Proposal challenge**: The prompt space is very large, and it grows with the number of modules, so finding high-quality prompts becomes very difficult.
   - **Credit assignment challenge**: The variables that parameterize the prompts of all modules must be optimized jointly, so the optimizer has to infer how each module's configuration contributes to overall performance.
2. **Propose solutions**: To address these challenges, the paper introduces several strategies:
   - **Proposal strategies**: Guided instruction generation, random search, and meta-optimization methods, which are used to generate candidate prompts efficiently.
   - **Credit assignment strategies**: Greedy methods, surrogate models, and history-based methods, which are used to evaluate the impact of different configurations on overall performance.
3. **Develop an optimizer**: Based on these strategies, the paper develops a new optimizer, MIPRO (Multi-prompt Instruction Proposal Optimizer), and tests it on multiple benchmark tasks. According to the abstract, MIPRO outperforms baseline optimizers on five of seven multi-stage LM programs using Llama-3-8B, by as much as 13% accuracy.

In summary, this paper aims to provide a systematic method for optimizing prompts in multi-stage language model programs, thereby improving the performance of these programs on natural language processing tasks.
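
The credit assignment idea above, scoring whole configurations on random mini-batches and using those scores to estimate how good each module's candidate prompt is, can be illustrated with a minimal sketch. This is not MIPRO's implementation: it swaps the paper's learned surrogate model for a simple per-candidate running mean with epsilon-greedy selection, and `toy_program`, `CANDIDATES`, and `DATA` are hypothetical stand-ins for a real LM program, its proposed instructions, and its training set.

```python
import random
from collections import defaultdict


def minibatch_score(program, config, dataset, batch_size, rng):
    """Estimate the program-level metric for `config` on a random mini-batch."""
    batch = rng.sample(dataset, min(batch_size, len(dataset)))
    return sum(program(config, example) for example in batch) / len(batch)


def optimize(program, candidates, dataset, trials=100, batch_size=4,
             epsilon=0.3, seed=0):
    """Pick one candidate index per module using only end-to-end scores."""
    rng = random.Random(seed)
    totals = defaultdict(float)  # (module, index) -> summed mini-batch scores
    counts = defaultdict(int)    # (module, index) -> number of trials

    def mean(module, index, untried):
        key = (module, index)
        return totals[key] / counts[key] if counts[key] else untried

    for _ in range(trials):
        config = {}
        for module, options in candidates.items():
            if rng.random() < epsilon:
                # Explore: pick a random candidate for this module.
                config[module] = rng.randrange(len(options))
            else:
                # Exploit: best running mean so far; untried candidates first.
                config[module] = max(
                    range(len(options)),
                    key=lambda i: mean(module, i, float("inf")),
                )
        score = minibatch_score(program, config, dataset, batch_size, rng)
        # Credit assignment: each module's chosen candidate shares the score.
        for module, index in config.items():
            totals[(module, index)] += score
            counts[(module, index)] += 1

    # Return the candidate with the highest running mean for every module.
    return {
        module: max(range(len(options)),
                    key=lambda i: mean(module, i, float("-inf")))
        for module, options in candidates.items()
    }


# Hypothetical two-module program: candidate instructions per module.
CANDIDATES = {
    "retrieve": ["Find passages.", "Find passages naming the entity.", "Search."],
    "answer": ["Answer.", "Answer in one short phrase.", "Respond."],
}
DATA = list(range(20))  # stand-in for training examples


def toy_program(config, example):
    """Stand-in for running the LM program and scoring its final output."""
    if example % 4 == 0:  # pretend some examples are always answered wrongly
        return 0.0
    return 0.5 * (config["retrieve"] == 1) + 0.5 * (config["answer"] == 1)


best = optimize(toy_program, CANDIDATES, DATA)
print(best)
```

The optimizer never sees a per-module label: it only observes one end-to-end score per trial and spreads that score over the candidates used in that trial. A running mean confounds modules with each other, which is one reason a learned surrogate over joint configurations is a more natural fit for this setting.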

Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs
