Abstract: Large language models (LLMs) have attracted significant attention for their remarkable abilities in various natural language processing tasks, but they suffer from hallucinations that will cause performance degradation. One promising solution to improve the LLMs' performance is to ask LLMs to revise their answer after generation, a technique known as self-correction. Among the two types of self-correction, intrinsic self-correction is considered a promising direction because it does not utilize external knowledge. However, recent works doubt the validity of LLMs' ability to conduct intrinsic self-correction. In this paper, we present a novel perspective on the intrinsic self-correction capabilities of LLMs through theoretical analyses and empirical experiments. In addition, we identify two critical factors for successful self-correction: zero temperature and fair prompts. Leveraging these factors, we demonstrate that intrinsic self-correction ability is exhibited across multiple existing LLMs. Our findings offer insights into the fundamental theories underlying the self-correction behavior of LLMs and remark on the importance of unbiased prompts and zero temperature settings in harnessing their full potential.

What problem does this paper attempt to address?

This paper attempts to address the "hallucination" problem that large language models (LLMs) encounter when generating answers for natural language processing tasks. Specifically, the paper explores whether LLMs possess intrinsic self-correction (ISC) capabilities and how this ability can be leveraged to improve LLM performance.

### Background and Issues

1. **Hallucination Problem in LLMs**:
   - LLMs sometimes generate unreasonable or contextually irrelevant answers when producing text, a phenomenon known as "hallucination."
   - Hallucination leads to a decline in LLM performance, especially in tasks requiring precise answers.
2. **Two Types of Self-Correction**:
   - **Extrinsic Self-Correction (ESC)**: Utilizes external knowledge for correction.
   - **Intrinsic Self-Correction (ISC)**: Relies solely on the model's own knowledge for correction.
3. **Controversies in Existing Research**:
   - Recent studies suggest that LLMs may not possess effective ISC capabilities. These studies argue that if LLMs could improve performance through ISC, they should be able to provide the correct answer on the initial attempt.
   - These studies raise several concerns, including external feedback, early stopping criteria, and unfair prompts.

### Main Contributions of the Paper

1. **Theoretical Analysis and Experimental Evidence**:
   - Through theoretical analysis and experimental validation, the paper demonstrates that LLMs do indeed possess ISC capabilities.
   - The paper points out that the reason LLMs fail to provide the correct answer on the initial attempt is due to their inherent hallucination characteristics.
2. **Key Factors**:
   - **Temperature Setting**: The paper finds that using zero temperature can maximize the effect of ISC. Non-zero temperature increases decision randomness, thereby reducing ISC effectiveness.
   - **Fair Prompts**: The paper emphasizes that using fair and unbiased prompts is crucial for achieving effective ISC. Fair prompts do not directly or indirectly influence LLMs to change or maintain their initial answers.
3. **Experimental Design**:
   - The paper uses two datasets (CommonSenseQA and GSM8K) to evaluate the ISC capabilities of different models.
   - Experimental results show that different models have varying sensitivities to temperature and prompts, but overall, zero temperature and fair prompts significantly enhance ISC effectiveness.

### Conclusion

- The paper demonstrates through theoretical analysis and experiments that LLMs possess ISC capabilities and identifies key factors (zero temperature and fair prompts) for achieving effective ISC.
- These findings contribute to a better understanding of the self-correction mechanisms in LLMs and provide guidance for future research.

### Example Illustration

The paper illustrates the impact of biased and fair prompts on LLM outputs through a specific example. For instance, in a three-stage self-correction process, biased prompts may lead LLMs from the correct answer to an incorrect one, while fair prompts help maintain the correct answer.

### Summary

This paper aims to address the hallucination problem in LLMs when generating answers and demonstrates through theory and experiments that LLMs possess intrinsic self-correction capabilities. The paper identifies key factors for achieving effective ISC, providing new perspectives and methods for improving LLM performance.
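
The temperature factor described above can be illustrated with a small, generic sketch of temperature-scaled decoding. This is not the paper's code: the logits, the function names `softmax_with_temperature` and `pick_token`, and the convention of treating temperature 0 as greedy (argmax) decoding are assumptions made for illustration. It only shows why a zero temperature makes the choice deterministic while a non-zero temperature adds randomness to the decision.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities for `logits` scaled by `temperature` (> 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits, temperature, rng):
    """Pick an index; temperature 0 is treated here as greedy (argmax) decoding."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for three candidate answers; not taken from the paper.
logits = [2.0, 1.5, 0.5]
rng = random.Random(0)

greedy = {pick_token(logits, 0, rng) for _ in range(1000)}
sampled = {pick_token(logits, 1.0, rng) for _ in range(1000)}

print(greedy)   # {0}: the highest-logit answer is chosen every time
print(sampled)  # sampling at temperature 1.0 can return different answers
```

Under these assumptions, repeated zero-temperature calls always return the same answer, so any change between self-correction rounds comes from the prompt and the model's context, not from sampling noise.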

Large Language Models have Intrinsic Self-Correction Ability
