Autonomous Prompt Engineering in Large Language Models

Daan Kepel, Konstantina Valogianni
2024-06-25
Abstract: Prompt engineering is a crucial yet challenging task for optimizing the performance of large language models (LLMs) on customized tasks. This pioneering research introduces the Automatic Prompt Engineering Toolbox (APET), which enables GPT-4 to autonomously apply prompt engineering techniques. By leveraging sophisticated strategies such as Expert Prompting, Chain of Thought, and Tree of Thoughts, APET empowers GPT-4 to dynamically optimize prompts, resulting in substantial improvements in tasks like Word Sorting (4.4% increase) and Geometric Shapes (6.8% increase). Despite encountering challenges in complex tasks such as Checkmate in One (-14.8%), these findings demonstrate the transformative potential of APET in automating complex prompt optimization processes without the use of external data. Overall, this research represents a significant leap in AI development, presenting a robust framework for future innovations in autonomous AI systems and highlighting the ability of GPT-4 to bring prompt engineering theory to practice. It establishes a foundation for enhancing performance on complex tasks and broadening the practical applications of these techniques in real-world scenarios.
Computation and Language, Artificial Intelligence, Human-Computer Interaction
What problem does this paper attempt to address?
The paper addresses the difficulty of prompt engineering for large language models (LLMs) on customized tasks: writing an effective prompt is important for performance but hard to do well. The researchers developed the Automatic Prompt Engineering Toolbox (APET), which enables GPT-4 to apply prompt engineering techniques autonomously. Drawing on techniques such as Expert Prompting, Chain of Thought, and Tree of Thoughts, APET lets GPT-4 rewrite a prompt dynamically, without the use of external data. The reported results are mixed. Performance improved on the Word Sorting task (4.4% increase) and the Geometric Shapes task (6.8% increase), but dropped by 14.8% on the more complex Checkmate in One chess task. The authors nonetheless take these findings as evidence that GPT-4 can put prompt engineering theory into practice, and they present APET as a framework for future work on autonomous AI systems, on improving performance in complex tasks, and on broadening the practical applications of these techniques.
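The idea of a model rewriting its own prompt from a toolbox of techniques can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `build_meta_prompt`, `optimize_prompt`, and `echo_llm`, the wording of the meta-prompt, and the one-line technique descriptions are hypothetical and are not taken from the paper. The model is passed in as a plain callable, so no particular API is implied.

```python
from typing import Callable

# Hypothetical one-line descriptions of the techniques named in the abstract.
# These are paraphrases for illustration, not the paper's actual toolbox text.
TOOLBOX = {
    "Expert Prompting": "Assign the model the role of a domain expert suited to the task.",
    "Chain of Thought": "Ask the model to reason step by step before answering.",
    "Tree of Thoughts": "Ask the model to explore several reasoning paths and compare them.",
}


def build_meta_prompt(original_prompt: str, toolbox: dict[str, str] = TOOLBOX) -> str:
    """Assemble an instruction asking a model to rewrite `original_prompt`."""
    techniques = "\n".join(f"- {name}: {desc}" for name, desc in toolbox.items())
    return (
        "You may apply any of the following prompt engineering techniques:\n"
        f"{techniques}\n\n"
        "Rewrite the prompt below so it is more likely to produce a correct answer. "
        "Return only the rewritten prompt.\n\n"
        f"Prompt: {original_prompt}"
    )


def optimize_prompt(original_prompt: str, llm: Callable[[str], str]) -> str:
    """Send the meta-prompt to `llm` and return the rewritten prompt."""
    return llm(build_meta_prompt(original_prompt)).strip()


def echo_llm(meta_prompt: str) -> str:
    """Stand-in model for offline runs: prefixes the original prompt with a role."""
    original = meta_prompt.split("Prompt: ", 1)[1]
    return "You are an expert at sorting. Think step by step. " + original
```

In practice `echo_llm` would be replaced by a call to a real model. The sketch only shows the control flow (toolbox in, rewritten prompt out, no external data) and says nothing about how the reported task results were obtained.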