Abstract: We introduce meta-prompting, an effective scaffolding technique designed to enhance the functionality of language models (LMs). This approach transforms a single LM into a multi-faceted conductor, adept at managing and integrating multiple independent LM queries. By employing high-level instructions, meta-prompting guides the LM to break down complex tasks into smaller, more manageable subtasks. These subtasks are then handled by distinct "expert" instances of the same LM, each operating under specific, tailored instructions. Central to this process is the LM itself, in its role as the conductor, which ensures seamless communication and effective integration of the outputs from these expert models. It additionally employs its inherent critical thinking and robust verification processes to refine and authenticate the end result. This collaborative prompting approach empowers a single LM to simultaneously act as a comprehensive orchestrator and a panel of diverse experts, significantly enhancing its performance across a wide array of tasks. The zero-shot, task-agnostic nature of meta-prompting greatly simplifies user interaction by obviating the need for detailed, task-specific instructions. Furthermore, our research demonstrates the seamless integration of external tools, such as a Python interpreter, into the meta-prompting framework, thereby broadening its applicability and utility. Through rigorous experimentation with GPT-4, we establish the superiority of meta-prompting over conventional scaffolding methods: when averaged across all tasks, including the Game of 24, Checkmate-in-One, and Python Programming Puzzles, meta-prompting augmented with Python interpreter functionality surpasses standard prompting by 17.1%, expert (dynamic) prompting by 17.3%, and multipersona prompting by 15.2%.
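
The conductor-and-experts loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `CALL <name>: <instructions>` and `FINAL:` markers, the function `meta_prompt`, the `max_rounds` cap, and the choice to show each expert only its own instructions are all assumptions made for illustration, and `scripted_lm` is a hypothetical stand-in for a real model call.

```python
import re
from typing import Callable, List

# Hypothetical message format; the abstract does not specify one.
EXPERT_CALL = re.compile(r"^CALL (?P<name>[^:\n]+):\s*(?P<instructions>.+)$", re.DOTALL)
FINAL = "FINAL:"


def meta_prompt(task: str, lm: Callable[[str], str], max_rounds: int = 5) -> str:
    """Run a conductor loop in which one `lm` plays the conductor and every expert."""
    history: List[str] = [f"Task: {task}"]
    for _ in range(max_rounds):
        # The conductor sees the whole history and decides the next step.
        reply = lm("You are the conductor.\n" + "\n".join(history)).strip()
        if reply.startswith(FINAL):
            return reply[len(FINAL):].strip()
        match = EXPERT_CALL.match(reply)
        if match is None:
            # Neither an expert call nor a final answer: keep it as a conductor note.
            history.append(f"Conductor: {reply}")
            continue
        name = match["name"].strip()
        instructions = match["instructions"].strip()
        # Assumption: each expert is a fresh query that sees only its instructions.
        expert_output = lm(f"You are {name}.\n{instructions}").strip()
        history.append(f"Conductor -> {name}: {instructions}")
        history.append(f"{name}: {expert_output}")
    raise RuntimeError("conductor produced no final answer within max_rounds")


def scripted_lm(prompt: str) -> str:
    """Hypothetical stand-in for a real LM so the sketch runs offline."""
    if prompt.startswith("You are the conductor."):
        if "Expert Mathematician: 24" in prompt:
            return "FINAL: 24"
        return "CALL Expert Mathematician: Compute 4 * 6 and reply with the number only."
    return "24"


answer = meta_prompt("What is 4 * 6?", scripted_lm)
print(answer)
```

With a real model, `lm` would wrap an API call; an external tool such as a Python interpreter could be handled the same way, by routing a designated expert name to a tool instead of the LM.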

A Comparative Study of DSPy Teleprompter Algorithms for Aligning Large Language Models Evaluation Metrics to Human Evaluation

Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

CoPrompter: User-Centric Evaluation of LLM Instruction Alignment for Improved Prompt Engineering

Supervisory Prompt Training

Automatic Prompt Optimization with "Gradient Descent" and Beam Search

DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises

PRompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Heuristic-based Sampling

Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together

PACE: Improving Prompt with Actor-Critic Editing for Large Language Model

PromptAid: Prompt Exploration, Perturbation, Testing and Iteration using Visual Analytics for Large Language Models

Toward Large Language Models as a Therapeutic Tool: Comparing Prompting Techniques to Improve GPT-Delivered Problem-Solving Therapy

Which Prompts Make The Difference? Data Prioritization For Efficient Human LLM Evaluation

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback

Prompt Space Optimizing Few-shot Reasoning Success with Large Language Models

Are Large Language Models Good Prompt Optimizers?

AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations

Optimising Hard Prompts with Few-Shot Meta-Prompting

Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding

Which is better? Exploring Prompting Strategy For LLM-based Metrics