Abstract: Large language models (LLMs) have demonstrated impressive performance on many tasks. However, to achieve optimal performance, specially designed prompting methods are still needed. These methods either rely on task-specific few-shot examples that require a certain level of domain knowledge, or are designed to be simple but only perform well on a few types of tasks. In this work, we attempt to introduce the concept of generalist prompting, which operates on the design principle of achieving optimal or near-optimal performance on a wide range of tasks while eliminating the need for manual selection and customization of prompts tailored to specific problems. Furthermore, we propose MeMo (Mental Models), an innovative prompting method that is simple-designed yet effectively fulfills the criteria of generalist prompting. MeMo distills the cores of various prompting methods into individual mental models and allows LLMs to autonomously select the most suitable mental models for the problem, achieving or being near to the state-of-the-art results on diverse tasks such as STEM, logical reasoning, and commonsense reasoning in zero-shot settings. We hope that the insights presented herein will stimulate further exploration of generalist prompting methods for LLMs.
What problem does this paper attempt to address?
This paper asks how large language models (LLMs) can achieve optimal or near-optimal performance on a wide range of tasks without prompts customized for each task. Existing prompting methods either rely on task-specific few-shot examples, which require some domain knowledge, or are simple by design but perform well on only a few types of tasks. The authors therefore introduce the concept of generalist prompting, under which LLMs perform well across many tasks without manual selection and customization of prompts for specific problems.
To achieve this goal, the authors propose MeMo (Mental Models), a prompting method that is simple in design yet meets the criteria of generalist prompting. MeMo distills the cores of various prompting methods into individual mental models and lets the LLM autonomously select the mental models best suited to the problem, thereby achieving or approaching state-of-the-art results in zero-shot settings on diverse tasks such as STEM, logical reasoning, and commonsense reasoning.
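The selection step described above can be illustrated with a minimal sketch. It assumes MeMo can be approximated by a single zero-shot prompt that lists the available mental models and asks the LLM to choose among them before solving. The `MentalModel` class, the three example models, the prompt wording, and the `build_generalist_prompt` helper are all hypothetical and written for illustration; they are not taken from the paper, whose actual mental models and prompt text may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MentalModel:
    """One distilled prompting strategy (hypothetical structure)."""
    name: str
    description: str


# Hypothetical mental models for illustration only; the paper's actual
# list and wording may differ.
MENTAL_MODELS = (
    MentalModel("Step-by-Step Thinking",
                "Break the problem into ordered intermediate steps."),
    MentalModel("Decomposition",
                "Split the problem into simpler sub-problems and solve each."),
    MentalModel("Self-Verification",
                "Check a candidate answer against the problem's constraints."),
)


def build_generalist_prompt(question: str, models=MENTAL_MODELS) -> str:
    """Build one zero-shot prompt that lets the LLM pick its own mental models.

    No task-specific examples are included, so the same prompt template
    can be reused unchanged across different kinds of problems.
    """
    catalog = "\n".join(
        f"{i}. {m.name}: {m.description}"
        for i, m in enumerate(models, start=1)
    )
    return (
        "You can use the following mental models:\n"
        f"{catalog}\n\n"
        "First, name the mental model(s) best suited to the problem. "
        "Then apply them to solve it and state the final answer.\n\n"
        f"Problem: {question}"
    )


if __name__ == "__main__":
    print(build_generalist_prompt("What is 17 * 24?"))
```

The resulting string would be sent to an LLM as is. Only the `Problem:` line changes between tasks, which is the property that distinguishes a generalist prompt from a prompt tuned per task.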
### Main contributions of the paper
1. **Introducing the concept of generalist prompting**: Proposing a new design principle for prompting, generalist prompting, which aims for optimal or near-optimal LLM performance across a wide range of tasks without task-specific prompt customization.
2. **Proposing the MeMo method**: Developing MeMo, a prompting method based on mental models that enables LLMs to autonomously select the mental models best suited to the problem.
3. **Empirical verification**: Verifying through experiments that MeMo performs strongly on tasks such as logical reasoning, STEM, and commonsense reasoning, demonstrating its wide applicability and effectiveness.
### Specific problem description
- **Existing problems**: Existing prompting methods either rely on a small number of task-specific examples, which require some domain knowledge, or are simple by design but perform well on only a few types of tasks.
- **Solution**: With the concept of generalist prompting and the MeMo method, LLMs can achieve optimal or near-optimal performance on multiple tasks without manual selection and customization of prompts.
### Experimental results
- **Logical reasoning**: MeMo performs excellently on the StrategyQA and FOLIO datasets, significantly outperforming other prompting methods.
- **STEM fields**: In multiple-choice tests in fields such as computer science, mathematics, and electrical engineering, MeMo also achieves the best or near-best results.
- **Commonsense reasoning**: In causal relationship and rhetorical identification tasks, MeMo also performs excellently, demonstrating its wide applicability across different tasks.
Through these improvements, MeMo not only improves the performance of LLMs on various tasks but also reduces the need for human intervention, demonstrating its great potential as a generalist prompting method.