Large Language Models have Intrinsic Self-Correction Ability

Dancheng Liu, Amir Nassereldine, Ziming Yang, Chenhui Xu, Yuting Hu, Jiajie Li, Utkarsh Kumar, Changjae Lee, Jinjun Xiong
2024-06-22
Abstract: Large language models (LLMs) have attracted significant attention for their remarkable abilities in various natural language processing tasks, but they suffer from hallucinations that will cause performance degradation. One promising solution to improve the LLMs' performance is to ask LLMs to revise their answer after generation, a technique known as self-correction. Among the two types of self-correction, intrinsic self-correction is considered a promising direction because it does not utilize external knowledge. However, recent works doubt the validity of LLM's ability to conduct intrinsic self-correction. In this paper, we present a novel perspective on the intrinsic self-correction capabilities of LLMs through theoretical analyses and empirical experiments. In addition, we identify two critical factors for successful self-correction: zero temperature and fair prompts. Leveraging these factors, we demonstrate that intrinsic self-correction ability is exhibited across multiple existing LLMs. Our findings offer insights into the fundamental theories underlying the self-correction behavior of LLMs and remark on the importance of unbiased prompts and zero temperature settings in harnessing their full potential.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper attempts to address the "hallucination" problem that large language models (LLMs) encounter when generating answers for natural language processing tasks. Specifically, the paper explores whether LLMs possess intrinsic self-correction (ISC) capabilities and how this ability can be leveraged to improve LLM performance.

### Background and Issues

1. **Hallucination Problem in LLMs**:
   - LLMs sometimes generate unreasonable or contextually irrelevant answers when producing text, a phenomenon known as "hallucination."
   - Hallucination degrades LLM performance, especially in tasks requiring precise answers.
2. **Two Types of Self-Correction**:
   - **Extrinsic Self-Correction (ESC)**: Utilizes external knowledge for correction.
   - **Intrinsic Self-Correction (ISC)**: Relies solely on the model's own knowledge for correction.
3. **Controversies in Existing Research**:
   - Recent studies suggest that LLMs may not possess effective ISC capabilities. These studies argue that if LLMs could improve performance through ISC, they should be able to provide the correct answer on the initial attempt.
   - These studies raise several concerns, including external feedback, early stopping criteria, and unfair prompts.

### Main Contributions of the Paper

1. **Theoretical Analysis and Experimental Evidence**:
   - Through theoretical analysis and experimental validation, the paper demonstrates that LLMs do possess ISC capabilities.
   - The paper attributes the failure of LLMs to provide the correct answer on the initial attempt to their inherent tendency to hallucinate.
2. **Key Factors**:
   - **Temperature Setting**: The paper finds that using zero temperature can maximize the effect of ISC. Non-zero temperature increases decision randomness, thereby reducing ISC effectiveness.
   - **Fair Prompts**: The paper emphasizes that using fair and unbiased prompts is crucial for achieving effective ISC. Fair prompts do not directly or indirectly influence LLMs to change or maintain their initial answers.
3. **Experimental Design**:
   - The paper uses two datasets (CommonSenseQA and GSM8K) to evaluate the ISC capabilities of different models.
   - Experimental results show that different models have varying sensitivities to temperature and prompts, but overall, zero temperature and fair prompts significantly enhance ISC effectiveness.

### Conclusion

- The paper demonstrates through theoretical analysis and experiments that LLMs possess ISC capabilities and identifies key factors (zero temperature and fair prompts) for achieving effective ISC.
- These findings contribute to a better understanding of the self-correction mechanisms in LLMs and provide guidance for future research.

### Example Illustration

The paper illustrates the impact of biased and fair prompts on LLM outputs through a specific example. For instance, in a three-stage self-correction process, biased prompts may lead LLMs from the correct answer to an incorrect one, while fair prompts help maintain the correct answer.

### Summary

This paper aims to address the hallucination problem in LLMs when generating answers and demonstrates through theory and experiments that LLMs possess intrinsic self-correction capabilities. The paper identifies key factors for achieving effective ISC, providing new perspectives and methods for improving LLM performance.
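
The multi-round self-correction process and the role of temperature described above can be illustrated with a small sketch. This is not the paper's code: the prompt strings, the `make_toy_model` stand-in, and the reading of "three-stage" as one initial answer plus two review rounds are all assumptions made for illustration. The toy model ignores the conversation and simply samples from fixed logits, so it shows only the control flow and how a non-zero temperature injects randomness between rounds; it does not show how a real LLM reacts to fair or biased prompts.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical prompt wording for illustration only; not the paper's prompts.
FAIR_PROMPT = "Please review your previous answer and state your final answer."
BIASED_PROMPT = "Your previous answer is wrong. Please correct it."

AskFn = Callable[[List[str]], str]


def sample_index(logits: Sequence[float], temperature: float, rng: random.Random) -> int:
    """Pick an index from logits; temperature 0 is treated as greedy (argmax) decoding."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def make_toy_model(logits: Sequence[float], options: Sequence[str],
                   temperature: float, seed: int = 0) -> AskFn:
    """Hypothetical stand-in for an LLM: ignores the history, samples from fixed logits."""
    rng = random.Random(seed)

    def ask(history: List[str]) -> str:
        return options[sample_index(logits, temperature, rng)]

    return ask


def self_correct(ask: AskFn, question: str, review_prompt: str, rounds: int = 2) -> List[str]:
    """Get an initial answer, then ask for `rounds` reviews; return every answer in order."""
    history = [question]
    answers = [ask(history)]
    for _ in range(rounds):
        # Append the previous answer and the review request, as in a chat transcript.
        history += [answers[-1], review_prompt]
        answers.append(ask(history))
    return answers


if __name__ == "__main__":
    logits, options = [1.0, 2.5, 0.3], ["A", "B", "C"]
    greedy = make_toy_model(logits, options, temperature=0.0)
    print(self_correct(greedy, "Which option is correct?", FAIR_PROMPT))  # ['B', 'B', 'B']
    noisy = make_toy_model(logits, options, temperature=1.0)
    print(self_correct(noisy, "Which option is correct?", FAIR_PROMPT))  # may differ between rounds
```

To experiment with a real model, `ask` could be replaced by any function that sends the conversation history to an LLM at the chosen temperature and returns its answer; comparing `FAIR_PROMPT` with `BIASED_PROMPT` would then probe the prompt-fairness effect the summary describes.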