Abstract: Large language models demonstrated state-of-the-art results on various reasoning tasks when applying the chain-of-thought (CoT) prompting technique. CoT prompting guides the model into breaking tasks into a few intermediate steps and provides step-by-step demonstrations. However, solving complex reasoning tasks remains a challenge. In this paper, we propose a novel prompting strategy inspired by Dialectical Behavioral Therapy (DBT). DBT, a form of cognitive-behavioral therapy, aims to help individuals cope with stress by developing a system of reasoning. We applied DBT's basic concepts of shaping dialog to construct prompts and conducted experiments on different datasets and LLMs with various numbers of parameters. Our results show that prompts crafted with DBT techniques significantly improve results on smaller models, achieving a 7% increase in accuracy on the StrategyQA, 4.8% on Aqua dataset using 8b parameters model, and a 16.2% increase on the StrategyQA, 5.3% on GSM8K dataset with 14b parameters model.
What problem does this paper attempt to address?
This paper asks how to improve the performance of large language models (LLMs) on complex reasoning tasks, especially conceptual reasoning tasks, by improving the prompting technique.
Specifically, although Chain-of-Thought (CoT) prompting has achieved strong results on various reasoning tasks, it still struggles on tasks that require conceptual reasoning, such as abstract algebra and school mathematics. To address this, the authors propose a new prompting strategy inspired by Dialectical Behavior Therapy (DBT).
### Main contributions:
1. **Introducing the DBT prompting technique**: The core principles of DBT (such as wise mind, observing, describing, and effectiveness) are used to construct prompts that guide the model to handle reasoning tasks more effectively.
2. **Experimental verification**: Verify the effectiveness of the DBT prompting technique by conducting experiments on different datasets and LLMs with different numbers of parameters.
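The idea of building prompts from DBT principles can be illustrated with a minimal sketch. The instruction wording, the `DBT_STEPS` mapping, and the `build_dbt_prompt` helper below are all hypothetical; the summary does not reproduce the authors' actual prompts, so this only shows one plausible way the four named principles might be turned into a step-by-step instruction.

```python
# Hypothetical DBT-style prompt builder.
# The step wording is illustrative only and is NOT taken from the paper.
DBT_STEPS = {
    "observe": "Observe: restate the facts given in the question without judging them.",
    "describe": "Describe: put into words what is being asked and what is known.",
    "wise_mind": "Wise mind: weigh logical reasoning against common-sense intuition.",
    "effectiveness": "Effectiveness: choose the approach that actually answers the question.",
}


def build_dbt_prompt(question, steps=("observe", "describe", "wise_mind", "effectiveness")):
    """Assemble a numbered, DBT-inspired instruction followed by the question."""
    lines = ["Answer the question by working through the following steps."]
    for index, name in enumerate(steps, start=1):
        lines.append(f"{index}. {DBT_STEPS[name]}")
    lines.append(f"Question: {question}")
    lines.append("Final answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_dbt_prompt("Is 17 a prime number?"))
```

Keeping the steps in a mapping makes it easy to ablate or reorder individual principles when comparing prompt variants, which is one way the variants mentioned as future work could be explored.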
### Experimental results:
- On smaller models (such as the llama3 model with 8B parameters), the DBT prompting technique significantly improves accuracy: by 7% on the StrategyQA dataset and by 4.8% on the Aqua dataset.
- On larger models (such as the phi3:medium model with 14B parameters), the DBT prompting technique also performs well on some datasets: accuracy increases by 5.3% on the GSM8K dataset and by 16.2% on the StrategyQA dataset.
- For the largest model tested (GPT-3.5-turbo Instruct), the DBT prompting technique does not outperform the baseline method, which may reflect the larger model's already high initial accuracy.
### Conclusion:
The DBT prompting technique performs well on smaller LLMs and can significantly improve their performance on complex reasoning tasks. For larger models the benefit is not evident, but the technique may still have application potential. Future research can explore more variants of the DBT prompting technique and apply it to the fields of education and psychology.
### Formula example:
The paper focuses on natural language processing and reasoning tasks and does not introduce specific mathematical formulas. Its results are reported as accuracy, which is conventionally defined as:
\[
\text{Accuracy}=\frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}}
\]
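The accuracy ratio above can be sketched in a few lines of Python, together with the gain between two prompting strategies. The answer lists are made-up toy data, not results from the paper, and treating the gain as a difference in percentage points is an assumption, since the summary does not say whether the reported increases are absolute or relative.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy data for illustration only (hypothetical, not from the paper).
references = ["yes", "no", "yes", "no", "yes"]
baseline_answers = ["yes", "yes", "no", "no", "yes"]  # 3 of 5 correct
dbt_answers = ["yes", "no", "no", "no", "yes"]        # 4 of 5 correct

baseline_acc = accuracy(baseline_answers, references)
dbt_acc = accuracy(dbt_answers, references)

# Assumed interpretation: gain as a difference in percentage points.
gain_points = (dbt_acc - baseline_acc) * 100

if __name__ == "__main__":
    print(f"baseline={baseline_acc:.2f} dbt={dbt_acc:.2f} gain={gain_points:.1f} points")
```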