Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies

Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang

2023-08-30

Abstract: Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks. However, their efficacy is undermined by undesired and inconsistent behaviors, including hallucination, unfaithful reasoning, and toxic content. A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output. Techniques leveraging automated feedback -- either produced by the LLM itself or some external system -- are of particular interest as they are a promising way to make LLM-based solutions more practical and deployable with minimal human feedback. This paper presents a comprehensive review of this emerging class of techniques. We analyze and taxonomize a wide array of recent work utilizing these strategies, including training-time, generation-time, and post-hoc correction. We also summarize the major applications of this strategy and conclude by discussing future directions and challenges.

Computation and Language, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

This paper addresses the undesired and inconsistent behaviors that large language models (LLMs) exhibit across natural language processing tasks, including hallucination, unfaithful reasoning, and toxic content. These behaviors undermine the reliability and trustworthiness of LLMs in practical applications. As a remedy, the paper examines self-correction, in which the LLM is prompted or guided to fix problems in its own output. The authors focus on techniques that use automated feedback, produced either by the LLM itself or by an external system, because such techniques make LLM-based solutions more practical and deployable with minimal human feedback. The paper's main contribution is a comprehensive review of this emerging class of techniques: it analyzes and taxonomizes recent work by when correction happens (training-time, generation-time, or post-hoc), summarizes the major applications, and discusses future directions and challenges. The review thereby documents the current state of LLM self-correction and offers guidance for future research.
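As a rough illustration of the post-hoc (post-generation) correction idea summarized above, the following minimal Python sketch runs a generate, critique, refine loop. It is not the paper's method or any specific system from the survey: the function names `post_hoc_correct`, `generate`, `critique`, and `refine`, the `max_rounds` parameter, and the toy stand-ins are all hypothetical, and the critic could equally be the LLM itself or an external system.

```python
from typing import Callable, Optional


def post_hoc_correct(
    prompt: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    refine: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> str:
    """Generate an output, then repeatedly critique and refine it.

    All callables are hypothetical placeholders for model or tool calls.
    `critique` returns None when it finds no problem with the output.
    """
    output = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(prompt, output)
        if feedback is None:  # automated feedback reports no remaining issue
            break
        output = refine(prompt, output, feedback)
    return output


# Toy stand-ins so the sketch runs without any model (assumptions, not real APIs).
def toy_generate(prompt: str) -> str:
    return "2 + 2 = 5"


def toy_critique(prompt: str, output: str) -> Optional[str]:
    return None if output.endswith("= 4") else "The sum is wrong; 2 + 2 is 4."


def toy_refine(prompt: str, output: str, feedback: str) -> str:
    return "2 + 2 = 4"


result = post_hoc_correct("What is 2 + 2?", toy_generate, toy_critique, toy_refine)
uncorrected = post_hoc_correct(
    "What is 2 + 2?", toy_generate, toy_critique, toy_refine, max_rounds=0
)
```

With the toy stand-ins, one round of feedback is enough to fix the answer, while `max_rounds=0` returns the initial output unchanged. In practice the three callables would wrap model or tool calls, and the stopping rule is a design choice rather than something the summary prescribes.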

Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies

Large Language Models have Intrinsic Self-Correction Ability

Large Language Models Cannot Self-Correct Reasoning Yet

When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs

Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

Is Moral Self-correction An Innate Capability of Large Language Models? A Mechanistic Analysis to Self-correction

N-Critics: Self-Refinement of Large Language Models with Ensemble of Critics

On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept

A Theoretical Understanding of Self-Correction through In-context Alignment

Training Language Models to Self-Correct via Reinforcement Learning

On the Intersection of Self-Correction and Trust in Language Models

Smaller Large Language Models Can Do Moral Self-Correction

Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks

Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction

On the (In)Effectiveness of Large Language Models for Chinese Text Correction

CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking

Small Language Models Need Strong Verifiers to Self-Correct Reasoning

Intrinsic Self-correction for Enhanced Morality: An Analysis of Internal Mechanisms and the Superficial Hypothesis

Small Language Model Can Self-correct