SPRIG: Improving Large Language Model Performance by System Prompt Optimization

Lechen Zhang, Tolga Ergen, Lajanugen Logeswaran, Moontae Lee, David Jurgens

2024-10-25

Abstract: Large Language Models (LLMs) have shown impressive capabilities in many scenarios, but their performance depends, in part, on the choice of prompt. Past research has focused on optimizing prompts specific to a task. However, much less attention has been given to optimizing the general instructions included in a prompt, known as a system prompt. To address this gap, we propose SPRIG, an edit-based genetic algorithm that iteratively constructs prompts from prespecified components to maximize the model's performance in general scenarios. We evaluate the performance of system prompts on a collection of 47 different types of tasks to ensure generalizability. Our study finds that a single optimized system prompt performs on par with task prompts optimized for each individual task. Moreover, combining system and task-level optimizations leads to further improvement, which showcases their complementary nature. Experiments also reveal that the optimized system prompts generalize effectively across model families, parameter sizes, and languages. This study provides insights into the role of system-level instructions in maximizing LLM potential.

Computation and Language, Artificial Intelligence, Human-Computer Interaction, Machine Learning

What problem does this paper attempt to address?

This paper asks how to improve the performance of large language models (LLMs) across many tasks by optimizing the system prompt, that is, the set of general instructions that precedes any task-specific details. Existing research mainly optimizes prompts for individual tasks and pays far less attention to system prompts. The paper proposes SPRIG (System Prompt Refinement for Increased Generalization), a genetic algorithm-based method that iteratively constructs and optimizes system prompts to maximize the model's performance in general scenarios. The main contributions of the paper are:

1. Optimizing system prompts can produce performance improvements comparable to task-specific optimization, even though these prompts contain general rather than task-specific instructions.
2. System prompt optimization and task prompt optimization are complementary, and combining them improves performance further.
3. The optimized system prompts generalize well across different models, parameter scales, and languages.

Through this research, the paper provides new insights into the role of system-level instructions in maximizing the potential of LLMs.
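The idea of an edit-based genetic search over prespecified components can be illustrated with a small sketch. It is not the authors' implementation: the component pool, the three edit operations (add, delete, swap), the search parameters, and the `toy_score` function are all assumptions made for illustration. In particular, `toy_score` stands in for what would really be an evaluation of the LLM on a benchmark of tasks.

```python
import random

# Hypothetical component pool; the components SPRIG actually uses are not
# listed in this summary.
COMPONENTS = [
    "You are a helpful assistant.",
    "Think step by step.",
    "Answer concisely.",
    "Double-check your answer before responding.",
    "If unsure, say so.",
]


def toy_score(prompt):
    """Stand-in for benchmark evaluation (an assumption, not a real metric).

    Rewards two arbitrarily chosen components and penalizes prompt length.
    A real scorer would run the model with this system prompt on many tasks.
    """
    useful = {COMPONENTS[1], COMPONENTS[3]}
    return len(useful & set(prompt)) - 0.1 * len(prompt)


def mutate(prompt, rng):
    """Apply one random edit (add, delete, or swap) to a tuple of components."""
    edited = list(prompt)
    op = rng.choice(["add", "delete", "swap"])
    if op == "add" or not edited:
        # An empty prompt can only grow, so it always takes the add branch.
        edited.insert(rng.randint(0, len(edited)), rng.choice(COMPONENTS))
    elif op == "delete":
        edited.pop(rng.randrange(len(edited)))
    else:
        edited[rng.randrange(len(edited))] = rng.choice(COMPONENTS)
    return tuple(edited)


def optimize(score_fn, generations=30, population_size=8,
             children_per_parent=4, seed=0):
    """Keep the best-scoring prompts each generation and edit them again."""
    rng = random.Random(seed)
    population = [()]  # start from the empty system prompt
    for _ in range(generations):
        candidates = set(population)  # parents survive, so the best never gets worse
        for parent in population:
            for _ in range(children_per_parent):
                candidates.add(mutate(parent, rng))
        # Sort by score (descending), then by content to break ties deterministically.
        ranked = sorted(candidates, key=lambda p: (-score_fn(p), p))
        population = ranked[:population_size]
    return population[0]


best = optimize(toy_score)
print(" ".join(best))
```

Because parents are carried into each new candidate set, the best score never decreases from one generation to the next. Replacing `toy_score` with a function that averages model accuracy over a set of tasks would turn the sketch into a search for a general-purpose system prompt, at the cost of one benchmark run per candidate.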

Are Large Language Models Good Prompt Optimizers?

Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey

Efficient Prompting Methods for Large Language Models: A Survey

GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers

Automatic Prompt Selection for Large Language Models

Unleashing the Potential of Large Language Models as Prompt Optimizers: An Analogical Analysis with Gradient-based Model Optimizers

MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization

Robustness-aware Automatic Prompt Optimization

The language of prompting: What linguistic properties make a prompt successful?

iPrOp: Interactive Prompt Optimization for Large Language Models with a Human in the Loop

A Study on Performance Improvement of Prompt Engineering for Generative AI with a Large Language Model

Automatic Engineering of Long Prompts

Supervisory Prompt Training

GRL-Prompt: Towards Knowledge Graph based Prompt Optimization via Reinforcement Learning

PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization

A Systematic Review on Prompt Engineering in Large Language Models for K-12 STEM Education

Connecting large language models with evolutionary algorithms yields powerful prompt optimizers

Robust Prompt Optimization for Large Language Models Against Distribution Shifts