Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models

Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi

2023-09-25

Abstract: While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge. This phenomenon poses a substantial challenge to the reliability of LLMs in real-world scenarios. In this paper, we survey recent efforts on the detection, explanation, and mitigation of hallucination, with an emphasis on the unique challenges posed by LLMs. We present taxonomies of the LLM hallucination phenomena and evaluation benchmarks, analyze existing approaches aiming at mitigating LLM hallucination, and discuss potential directions for future research.

Computation and Language, Artificial Intelligence, Computers and Society, Machine Learning

What problem does this paper attempt to address?

The paper addresses hallucination in large language models (LLMs): LLMs sometimes generate content that is inconsistent with the user input, contradicts their own earlier output, or conflicts with established facts. This undermines the reliability of LLMs in practical applications. The paper categorizes hallucinations into three types:

1. **Input-Conflicting Hallucination**: The generated content is inconsistent with the user's input.
2. **Context-Conflicting Hallucination**: The generated content contradicts content the model generated earlier.
3. **Fact-Conflicting Hallucination**: The generated content is inconsistent with established world knowledge.

The paper notes that the first two types have been widely studied in traditional natural language generation tasks, whereas fact-conflicting hallucination has become the current research focus because of its complexity and its potential for real-world harm. It also discusses challenges specific to LLMs: the massive scale of training data, the versatility of LLMs, and the difficulty of detecting errors. Finally, it surveys benchmarks for evaluating hallucination in LLMs and how those benchmarks are constructed.

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

A Survey on Hallucination in Large Vision-Language Models

Hallucination of Multimodal Large Language Models: A Survey

A Survey of Hallucination in Large Visual Language Models

Unravelling the Mysteries of Hallucination in Large Language Models: Strategies for Precision in Artificial Intelligence Language Generation

Cognitive Mirage: A Review of Hallucinations in Large Language Models

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

Hallucination Detection and Hallucination Mitigation: An Investigation

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Sources of Hallucination by Large Language Models on Inference Tasks

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

On Large Language Models' Hallucination with Regard to Known Facts

AutoHall: Automated Hallucination Dataset Generation for Large Language Models