Abstract: As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains their tendency to hallucinate, that is, to generate content that appears factual but is ungrounded. Hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs in real-world production systems that impact people's lives. Widespread adoption of LLMs in practical settings relies heavily on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying information to align superficially with the input. This becomes especially alarming when we rely on language generation capabilities for sensitive applications, such as summarizing medical records or financial analysis reports. This paper presents a comprehensive survey of over 32 techniques developed to mitigate hallucination in LLMs. Notable among these are Retrieval Augmented Generation (Lewis et al., 2021), Knowledge Retrieval (Varshney et al., 2023), CoNLI (Lei et al., 2023), and CoVe (Dhuliawala et al., 2023). Furthermore, we introduce a detailed taxonomy categorizing these methods based on various parameters, such as dataset utilization, common tasks, feedback mechanisms, and retriever types. This classification helps distinguish the diverse approaches specifically designed to tackle hallucination issues in LLMs. Additionally, we analyze the challenges and limitations inherent in these techniques, providing a solid foundation for future research on hallucinations and related phenomena in LLMs.
