How Interpretable are Reasoning Explanations from Prompting Large Language Models?

Wei Jie Yeo,Ranjan Satapathy,Rick Siow Mong Goh,Erik Cambria

2024-04-01

Abstract: Prompt Engineering has garnered significant attention for enhancing the performance of large language models across a multitude of tasks. Techniques such as the Chain-of-Thought not only bolster task performance but also delineate a clear trajectory of reasoning steps, offering a tangible form of explanation for the audience. Prior works on interpretability assess the reasoning chains yielded by Chain-of-Thought solely along a singular axis, namely faithfulness. We present a comprehensive and multifaceted evaluation of interpretability, examining not only faithfulness but also robustness and utility across multiple commonsense reasoning benchmarks. Likewise, our investigation is not confined to a single prompting technique; it expansively covers a multitude of prevalent prompting techniques employed in large language models, thereby ensuring a wide-ranging and exhaustive evaluation. In addition, we introduce a simple interpretability alignment technique, termed Self-Entailment-Alignment Chain-of-thought, that yields more than 70% improvements across multiple dimensions of interpretability. Code is available at

Computer Science

What problem does this paper attempt to address?

The paper addresses how to evaluate the interpretability of explanations generated by large language models (LLMs) along more than one dimension, and proposes a method to improve the quality of those explanations. Specifically:

1. **Multidimensional Interpretability Evaluation**: The paper proposes an evaluation framework that assesses not only the faithfulness of explanations but also their robustness and utility. The framework is applied to multiple commonsense reasoning benchmark datasets.
2. **Proposing the SEA-CoT Method**: The authors propose a new method called Self-Entailment-Alignment Chain-of-thought (SEA-CoT), which improves the quality of explanations through consistency alignment. Compared to the existing Self-Consistency CoT (SC-CoT), SEA-CoT additionally considers entailment and overlap with the supporting context during the explanation selection phase.
3. **Experimental Validation**: Experiments across different prompting techniques (such as CoT and SC-CoT) show that SEA-CoT improves the interpretability of explanations on multiple benchmark datasets, with the strongest results on the OpenBookQA dataset.

In summary, the paper aims to make explanations generated by large language models more interpretable and more reliable in practical applications by combining a comprehensive interpretability evaluation with the SEA-CoT method.
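The summary describes SEA-CoT only at a high level: explanations are selected using entailment and overlap with the supporting context. The sketch below illustrates one way such a selection step could look; it is not the authors' implementation. The function names, the equal weighting of the two scores, the use of "question plus majority answer" as the supporting context, and the token-coverage stand-in for an entailment model are all assumptions made for illustration. In practice the `entail` argument would be replaced by a real natural language inference scorer.

```python
from collections import Counter
from typing import Callable, List, Tuple

PUNCT = ".,;:!?()\"'"


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (t.strip(PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]


def overlap_score(explanation: str, context: str) -> float:
    """Fraction of explanation tokens that also occur in the context."""
    exp_tokens = tokenize(explanation)
    if not exp_tokens:
        return 0.0
    ctx = set(tokenize(context))
    return sum(t in ctx for t in exp_tokens) / len(exp_tokens)


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an NLI model (assumption, not from the paper):
    the share of hypothesis tokens that are covered by the premise."""
    hyp_tokens = tokenize(hypothesis)
    if not hyp_tokens:
        return 0.0
    prem = set(tokenize(premise))
    return sum(t in prem for t in hyp_tokens) / len(hyp_tokens)


def select_explanation(
    candidates: List[Tuple[str, str]],
    question: str,
    entail: Callable[[str, str], float] = toy_entailment,
    weight: float = 0.5,
) -> Tuple[str, str]:
    """Pick one explanation from sampled (explanation, answer) pairs.

    Step 1: majority vote over answers, as in self-consistency.
    Step 2: among explanations that support the majority answer, keep the
            one with the highest combined entailment and overlap score.
    """
    votes = Counter(answer for _, answer in candidates)
    majority = votes.most_common(1)[0][0]
    context = f"{question} {majority}"  # assumed form of the supporting context
    pool = [exp for exp, answer in candidates if answer == majority]

    def score(exp: str) -> float:
        return weight * entail(exp, context) + (1 - weight) * overlap_score(exp, context)

    return max(pool, key=score), majority


# Usage with made-up samples (illustrative data, not from any benchmark).
question = "Which spoon gets hot fastest in hot soup?"
candidates = [
    ("Birds have wings so they fly.", "the wooden spoon"),
    ("Metal conducts heat, so the metal spoon gets hot in soup.", "the metal spoon"),
    ("Spoons are shiny.", "the metal spoon"),
    ("Heat moves through metal.", "the metal spoon"),
]
best, answer = select_explanation(candidates, question)
print(answer)
print(best)
```

Keeping the entailment scorer as a pluggable callable separates the selection logic from the choice of model, so the same ranking step can be tried with different scorers or weights.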

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning

Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting

Large Language Models Cannot Explain Themselves

Self-Polish: Enhance Reasoning in Large Language Models Via Problem Refinement

PromptExp: Multi-granularity Prompt Explanation of Large Language Models

Advances in reasoning by prompting large language models: A survey

Reasoning with Large Language Models, a Survey

OPT-R: Exploring the Role of Explanations in Finetuning and Prompting for Reasoning Skills of Large Language Models

Chain-of-Thought in Large Language Models: Decoding, Projection, and Activation

What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

Chain-of-Thought Reasoning Without Prompting

Post Hoc Explanations of Language Models Can Improve Language Models

Evaluating the Reliability of Self-Explanations in Large Language Models

XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution

Logic-of-Thought: Injecting Logic into Contexts for Full Reasoning in Large Language Models

On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models