Abstract: As Large Language Models (LLMs) continue to advance in their ability to write human-like text, a key challenge remains their tendency to hallucinate, generating content that appears factual but is ungrounded. Hallucination is arguably the biggest hindrance to safely deploying these powerful LLMs in real-world production systems that impact people's lives. Widespread adoption of LLMs in practical settings relies heavily on addressing and mitigating hallucinations. Unlike traditional AI systems focused on limited tasks, LLMs have been exposed to vast amounts of online text data during training. While this allows them to display impressive language fluency, it also means they are capable of extrapolating information from the biases in training data, misinterpreting ambiguous prompts, or modifying information to align superficially with the input. This is especially alarming when we rely on language generation capabilities for sensitive applications, such as summarizing medical records or financial analysis reports. This paper presents a comprehensive survey of over 32 techniques developed to mitigate hallucination in LLMs. Notable among these are Retrieval Augmented Generation (Lewis et al., 2021), Knowledge Retrieval (Varshney et al., 2023), CoNLI (Lei et al., 2023), and CoVe (Dhuliawala et al., 2023). Furthermore, we introduce a detailed taxonomy categorizing these methods based on various parameters, such as dataset utilization, common tasks, feedback mechanisms, and retriever types. This classification helps distinguish the diverse approaches specifically designed to tackle hallucination in LLMs. Additionally, we analyze the challenges and limitations inherent in these techniques, providing a solid foundation for future research on hallucinations and related phenomena in LLMs.

Don't Believe Everything You Read: Enhancing Summarization Interpretability through Automatic Identification of Hallucinations in Large Language Models

Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends

From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

Hallucination Diversity-Aware Active Learning for Text Summarization

Beyond Fine-Tuning: Effective Strategies for Mitigating Hallucinations in Large Language Models for Data Analytics

The Troubling Emergence of Hallucination in Large Language Models -- An Extensive Definition, Quantification, and Prescriptive Remediations

Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

Banishing LLM Hallucinations Requires Rethinking Generalization

Sources of Hallucination by Large Language Models on Inference Tasks

Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

Fine-grained Hallucination Detection and Editing for Language Models

Hallucination of Multimodal Large Language Models: A Survey

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

Detecting and Mitigating Hallucinations in Multilingual Summarisation

Towards Mitigating Hallucination in Large Language Models via Self-Reflection

Comprehending and Reducing LLM Hallucinations