Abstract: Large language models have demonstrated state-of-the-art results on various reasoning tasks when applying the chain-of-thought (CoT) prompting technique. CoT prompting guides the model to break a task into a few intermediate steps and provides step-by-step demonstrations. However, solving complex reasoning tasks remains a challenge. In this paper, we propose a novel prompting strategy inspired by Dialectical Behavior Therapy (DBT). DBT, a form of cognitive-behavioral therapy, aims to help individuals cope with stress by developing a system of reasoning. We applied DBT's basic concepts of shaping dialog to construct prompts and conducted experiments on different datasets and LLMs with various numbers of parameters. Our results show that prompts crafted with DBT techniques significantly improve results on smaller models, achieving a 7% increase in accuracy on StrategyQA and 4.8% on the Aqua dataset with an 8B-parameter model, and a 16.2% increase on StrategyQA and 5.3% on the GSM8K dataset with a 14B-parameter model.

What problem does this paper attempt to address?

This paper addresses how to improve the performance of large language models (LLMs) on complex reasoning tasks, especially conceptual reasoning tasks, by improving the prompting technique. Although Chain-of-Thought (CoT) prompting has achieved remarkable results on various reasoning tasks, it still struggles with tasks that require conceptual reasoning, such as abstract algebra and school mathematics. To address this, the authors propose a new prompting strategy inspired by Dialectical Behavior Therapy (DBT).

### Main contributions:

1. **Introducing the DBT prompting technique**: The core principles of DBT (such as wise mind, observing, describing, and effectiveness) are used to construct prompts that guide the model to handle reasoning tasks more effectively.
2. **Experimental verification**: The effectiveness of DBT prompting is verified through experiments on different datasets and on LLMs with different numbers of parameters.

### Experimental results:

- On smaller models (such as the llama3 model with 8B parameters), DBT prompting significantly improves accuracy. For example, accuracy increases by 7% on the StrategyQA dataset and by 4.8% on the Aqua dataset.
- On larger models (such as the phi3:medium model with 14B parameters), DBT prompting also performs well on some datasets. For example, accuracy increases by 5.4% on the GSM8K dataset and by 16.2% on the StrategyQA dataset.
- On the largest model (such as GPT-3.5-turbo Instruct), DBT prompting does not outperform the baseline method, a result attributed to the larger model's already high initial accuracy.

### Conclusion:

The DBT prompting technique performs well on smaller LLMs and can significantly improve their performance on complex reasoning tasks. For larger models the effect is not obvious, but the technique still has some application potential. Future research can explore more variants of DBT prompting and apply it to the fields of education and psychology.

### Formula example:

The paper concerns natural language processing and reasoning tasks and does not involve specific mathematical, physical, or chemical formulas. The reported metric, accuracy, follows the standard definition:

\[ \text{Accuracy}=\frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \]
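To make the idea of a DBT-structured prompt more concrete, here is a minimal Python sketch. It is not the authors' prompt: the step wording, the step order, and the names `DBT_STEPS`, `build_dbt_prompt`, and `accuracy` are all assumptions made for illustration. Only the four principle names (observing, describing, wise mind, effectiveness) come from the summary above. The `accuracy` helper implements the standard correct-over-total ratio with a simple exact-match comparison, which may differ from the paper's answer-matching procedure.

```python
from typing import Iterable

# Hypothetical instructions for each DBT principle named in the summary.
# The paper's real prompt text is not reproduced here.
DBT_STEPS = (
    ("Observe", "Note the facts stated in the question without judging them."),
    ("Describe", "Restate the problem in your own words."),
    ("Wise mind", "Balance careful logic with intuition and reason step by step."),
    ("Effectiveness", "Focus on what answers the question and give a final answer."),
)


def build_dbt_prompt(question: str) -> str:
    """Assemble a plain-text prompt that walks through the DBT-style steps."""
    lines = ["Work through the following steps before answering."]
    for index, (name, instruction) in enumerate(DBT_STEPS, start=1):
        lines.append(f"{index}. {name}: {instruction}")
    lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(predictions: Iterable[str], references: Iterable[str]) -> float:
    """Return correct predictions divided by total predictions (exact match)."""
    preds = list(predictions)
    refs = list(references)
    if len(preds) != len(refs):
        raise ValueError("predictions and references must have the same length")
    if not refs:
        raise ValueError("at least one prediction is required")
    # Case-insensitive, whitespace-trimmed exact match (an assumed convention).
    correct = sum(p.strip().lower() == r.strip().lower() for p, r in zip(preds, refs))
    return correct / len(refs)


if __name__ == "__main__":
    print(build_dbt_prompt("Would a pear sink in water?"))
    print(accuracy(["yes", "no"], ["yes", "yes"]))
```

The prompt string would be passed to whichever model is under evaluation, and the resulting answers compared against the dataset labels with `accuracy`. Comparing that score with the score from a CoT baseline prompt on the same questions gives the kind of accuracy difference the summary reports.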

Dialectical Behavior Therapy Approach to LLM Prompting
