Reasoning Beyond Bias: A Study on Counterfactual Prompting and Chain of Thought Reasoning

Kyle Moore, Jesse Roberts, Thao Pham, Douglas Fisher
2024-09-06
Abstract: Language models are known to absorb biases from their training data, leading to predictions driven by statistical regularities rather than semantic relevance. We investigate the impact of these biases on answer choice preferences in the Massive Multi-Task Language Understanding (MMLU) task. Our findings reveal that differences in learned regularities across answer options are predictive of model preferences and mirror human test-taking strategies. To address this issue, we introduce two novel methods: Counterfactual Prompting with Chain of Thought (CoT) and Counterfactual Prompting with Agnostically Primed CoT (APriCoT). We demonstrate that while Counterfactual Prompting with CoT alone is insufficient to mitigate bias, our novel Primed Counterfactual Prompting with CoT approach effectively reduces the influence of base-rate probabilities while improving overall accuracy. Our results suggest that mitigating bias requires a "System-2" like process and that CoT reasoning is susceptible to confirmation bias under some prompting methodologies. Our contributions offer practical solutions for developing more robust and fair language models.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the base-rate probability (BRP) effect in large language models (LLMs). Specifically, the authors focus on reducing bias in these models' answer-selection preferences, particularly in the Massive Multi-Task Language Understanding (MMLU) task. The main problems and proposed solutions are as follows:

### Main Problems

1. **The Influence of the Base-Rate Probability Effect**: LLMs absorb biases from their training data, so their predictions rely on statistical regularities more than on semantic relevance. This bias affects the model's preference among answer options.
2. **The Limitations of Chain of Thought (CoT)**: Although Chain of Thought prompting can promote more complex reasoning, in some cases it exacerbates the base-rate probability effect, in a manner similar to confirmation bias.

### Solutions

To address these problems, the authors propose two new methods:

1. **Combining Counterfactual Prompting with Chain of Thought (CF + CoT)**: This method attempts to mitigate the base-rate probability effect by introducing counterfactual prompts, but the experimental results show that it amplifies the effect instead.
2. **Combining Counterfactual Prompting with Agnostically Primed Chain of Thought (APriCoT)**: This is an improved method. By evaluating each answer option independently and requiring the model to provide coherent commentary on it, APriCoT reduces bias and improves accuracy.

### Experimental Results

- **CF + CoT**: Experiments show that CF + CoT fails to mitigate the base-rate probability effect and in fact makes it more pronounced, causing the model's selection distribution to deviate from the true answer distribution.
- **APriCoT**: In contrast, APriCoT reduces the influence of the base-rate probability effect, bringing the model's selection distribution closer to the true answer distribution and improving overall accuracy.
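The comparison between a model's selection distribution and the true answer distribution can be made concrete with a small sketch. The example below is only an illustration: the total variation distance is one possible way to measure the gap and is an assumption here, not necessarily the metric used in the paper, and the answer lists are hypothetical toy data rather than reported results.

```python
from collections import Counter

OPTIONS = ("A", "B", "C", "D")


def distribution(labels, options=OPTIONS):
    """Return the relative frequency of each option in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return {opt: counts[opt] / total for opt in options}


def total_variation(p, q):
    """Total variation distance between two distributions over the same keys."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


# Hypothetical toy data (assumption): correct labels and two sets of model picks.
true_answers = ["A", "B", "C", "D", "A", "B", "C", "D"]
biased_picks = ["A", "A", "A", "D", "A", "B", "A", "D"]    # over-selects "A"
balanced_picks = ["A", "B", "C", "D", "A", "B", "D", "C"]  # matches the true spread

true_dist = distribution(true_answers)
tv_biased = total_variation(distribution(biased_picks), true_dist)
tv_balanced = total_variation(distribution(balanced_picks), true_dist)

print(f"biased picks:   TV distance = {tv_biased:.3f}")
print(f"balanced picks: TV distance = {tv_balanced:.3f}")
```

A larger distance indicates a selection distribution that has drifted further from the true answer distribution. Note that a matching distribution alone does not imply high accuracy, which is why accuracy is reported alongside it.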
### Conclusion

With APriCoT, the authors reduce the influence of the base-rate probability effect on LLM behavior and improve model performance on the MMLU task. This offers a practical approach for developing more robust and fair language models.

### Formula Representation

The base-rate probability effect discussed in the paper can be expressed using Bayes' rule:

\[ P(\text{Choice} \mid \text{Context}) = \frac{P(\text{Context} \mid \text{Choice}) \cdot P(\text{Choice})}{P(\text{Context})} \]

where \( P(\text{Choice}) \) is the base-rate probability (BRP) of an answer option, \( P(\text{Context} \mid \text{Choice}) \) is the probability of the context given that option, and \( P(\text{Context}) \) is the total probability of the context.

In this way, the paper shows how an improved prompting strategy can mitigate the base-rate probability effect in LLMs, thereby improving the accuracy and fairness of the model.
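To see how the base-rate term in the formula above can change the preferred answer, a small numeric sketch may help. All probabilities below are made-up illustrative values (assumptions, not figures from the paper), and `posterior` is a hypothetical helper name. The same likelihoods are combined first with a uniform prior and then with a prior skewed toward option A.

```python
def posterior(likelihood, prior):
    """Apply Bayes' rule over a discrete set of answer choices."""
    # Numerator: P(Context | Choice) * P(Choice) for each choice.
    joint = {c: likelihood[c] * prior[c] for c in prior}
    # Denominator: P(Context), the sum of the joint terms.
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Illustrative values (assumptions): the context favors option B.
likelihood = {"A": 0.2, "B": 0.4, "C": 0.2, "D": 0.2}
uniform_prior = {c: 0.25 for c in likelihood}
skewed_prior = {"A": 0.55, "B": 0.15, "C": 0.15, "D": 0.15}  # strong base rate for A

post_uniform = posterior(likelihood, uniform_prior)
post_skewed = posterior(likelihood, skewed_prior)

best_uniform = max(post_uniform, key=post_uniform.get)
best_skewed = max(post_skewed, key=post_skewed.get)

print("uniform prior picks:", best_uniform)
print("skewed prior picks: ", best_skewed)
```

With the uniform prior, the option best supported by the context wins. With the skewed prior, the base rate outweighs the contextual evidence and a different option is preferred, which is the kind of behavior the paper's methods aim to reduce.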