Internal Consistency and Self-Feedback in Large Language Models: A Survey

Xun Liang, Shichao Song, Zifan Zheng, Hanyu Wang, Qingchen Yu, Xunkai Li, Rong-Hua Li, Yi Wang, Zhonghao Wang, Feiyu Xiong, Zhiyu Li
2024-09-18
Abstract: Large language models (LLMs) often exhibit deficient reasoning or generate hallucinations. To address these, studies prefixed with "Self-" such as Self-Consistency, Self-Improve, and Self-Refine have been initiated. They share a commonality: involving LLMs evaluating and updating themselves. Nonetheless, these efforts lack a unified perspective on summarization, as existing surveys predominantly focus on categorization. In this paper, we use a unified perspective of internal consistency, offering explanations for reasoning deficiencies and hallucinations. Internal consistency refers to the consistency in expressions among LLMs' latent, decoding, or response layers based on sampling methodologies. Then, we introduce an effective theoretical framework capable of mining internal consistency, named Self-Feedback. This framework consists of two modules: Self-Evaluation and Self-Update. The former captures internal consistency signals, while the latter leverages the signals to enhance either the model's response or the model itself. This framework has been employed in numerous studies. We systematically classify these studies by tasks and lines of work; summarize relevant evaluation methods and benchmarks; and delve into the concern, "Does Self-Feedback Really Work?" We also propose several critical viewpoints, including the "Hourglass Evolution of Internal Consistency", "Consistency Is (Almost) Correctness" hypothesis, and "The Paradox of Latent and Explicit Reasoning". The relevant resources are open-sourced at https://github.com/IAAR-Shanghai/ICSFSurvey.
Computation and Language
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses deficiencies of large language models (LLMs) in reasoning ability and generation consistency. It focuses on three core issues:

1. **Insufficient Reasoning Ability**: Large language models often exhibit logical inconsistencies or reasoning errors on logical reasoning tasks. For example, when answering mathematical questions, the model may provide incorrect answers.
2. **Generation Hallucinations**: Large language models sometimes generate content that is inconsistent with facts or self-contradictory, a phenomenon known as "hallucination." Hallucinations reduce the reliability and credibility of the model.
3. **Lack of Internal Consistency**: For the same input, a large language model may produce completely different answers depending on the decoding strategy and its randomness (for example, Top-k or Top-p sampling, or beam search). This internal inconsistency is particularly evident in the deeper latent layers, where different attention heads may lead to different outputs.

To address these issues, the paper proposes a unified perspective, **internal consistency**, and introduces a theoretical framework, **Self-Feedback**. The framework has two modules: **Self-Evaluation** and **Self-Update**. The Self-Evaluation module captures internal consistency signals, while the Self-Update module uses these signals to enhance the model's responses or improve the model itself. By systematically categorizing related research, summarizing evaluation methods and benchmarks, and examining whether self-feedback really works, the paper aims to provide theoretical and practical guidance for improving the internal consistency of large language models.
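
As a rough illustration of the two modules, the sketch below applies a response-level consistency check in the spirit of Self-Consistency: several sampled answers are compared, their agreement is treated as the consistency signal, and that signal decides which answer to return. It is a minimal sketch under stated assumptions, not the survey's implementation. The function names `self_evaluate` and `self_update`, the `threshold` parameter, and the string normalization are all hypothetical, and the sampled answers are passed in as a plain list rather than drawn from a real model.

```python
from collections import Counter
from typing import List, Optional, Tuple


def self_evaluate(answers: List[str]) -> Tuple[str, float]:
    """Self-Evaluation (hypothetical): measure agreement among sampled answers.

    Returns the most frequent normalized answer and the fraction of
    samples that agree with it. That fraction is the consistency signal.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Assumption: trimming and lowercasing is enough to compare answers.
    counts = Counter(a.strip().lower() for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)


def self_update(answers: List[str], threshold: float = 0.5) -> Optional[str]:
    """Self-Update (hypothetical): use the signal to pick the final response.

    Returns the majority answer when agreement reaches the threshold,
    and None (abstain) when the samples disagree too much.
    """
    answer, consistency = self_evaluate(answers)
    return answer if consistency >= threshold else None


# Stand-in for five responses sampled from a model for one question.
samples = ["42", "42", "41", "42 ", "40"]
final_answer = self_update(samples)
```

With the samples shown, three of five answers agree, so the majority answer is returned. Lowering agreement below the threshold makes `self_update` abstain, which is one simple way a consistency signal could gate a response. Updating the model itself, the other form of Self-Update the paper describes, is outside the scope of this sketch.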