Zero-Resource Hallucination Prevention for Large Language Models

Junyu Luo, Cao Xiao, Fenglong Ma

2023-10-08

Abstract: The prevalent use of large language models (LLMs) in various domains has drawn attention to the issue of "hallucination," which refers to instances where LLMs generate factually inaccurate or ungrounded information. Existing techniques for hallucination detection in language assistants rely on intricate fuzzy, specific free-language-based chain of thought (CoT) techniques or parameter-based methods that suffer from interpretability issues. Additionally, the methods that identify hallucinations post-generation could not prevent their occurrence and suffer from inconsistent performance due to the influence of the instruction format and model style. In this paper, we introduce a novel pre-detection self-evaluation technique, referred to as SELF-FAMILIARITY, which focuses on evaluating the model's familiarity with the concepts present in the input instruction and withholding the generation of response in case of unfamiliar concepts. This approach emulates the human ability to refrain from responding to unfamiliar topics, thus reducing hallucinations. We validate SELF-FAMILIARITY across four different large language models, demonstrating consistently superior performance compared to existing techniques. Our findings propose a significant shift towards preemptive strategies for hallucination mitigation in LLM assistants, promising improvements in reliability, applicability, and interpretability.

Computation and Language

What problem does this paper attempt to address?

The paper addresses "hallucination" in large language models (LLMs): cases where a model generates factually inaccurate or ungrounded information. Existing detection techniques rely either on intricate, fuzzy free-language chain-of-thought (CoT) techniques or on parameter-based methods that suffer from interpretability issues. Methods that identify hallucinations after generation also cannot prevent them from occurring, and their performance is inconsistent because it depends on instruction format and model style. The paper therefore proposes a pre-detection self-evaluation technique called SELF-FAMILIARITY, which evaluates the model's familiarity with the concepts in the input instruction and withholds the response when a concept is unfamiliar. This emulates the human ability to refrain from responding to unfamiliar topics. The authors validate SELF-FAMILIARITY on four different large language models and report consistently superior performance over existing techniques, which they present as promising for reliability, applicability, and interpretability.
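The gating idea described above (check familiarity with the input's concepts first, and withhold the response if something is unfamiliar) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the toy scores, and the choice to aggregate with the minimum score are all assumptions made for illustration. In a real system, concept extraction and familiarity scoring would be done with the language model itself.

```python
from typing import Callable, Iterable, Optional


def gate_response(
    instruction: str,
    extract_concepts: Callable[[str], Iterable[str]],
    familiarity: Callable[[str], float],
    generate: Callable[[str], str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Optional[str]:
    """Generate a response only if every extracted concept looks familiar.

    Returns None when the response is withheld.
    """
    scores = {c: familiarity(c) for c in extract_concepts(instruction)}
    # Assumption: aggregate with min, so one unfamiliar concept is enough
    # to withhold the response before any text is generated.
    if scores and min(scores.values()) < threshold:
        return None
    return generate(instruction)


# Toy stand-ins so the sketch runs without a model (all values are made up).
TOY_SCORES = {"aspirin": 0.9, "zorvatrine": 0.1}


def toy_extract(text: str) -> list:
    # Keep only the words that appear in the toy score table.
    words = [w.strip("?.,!").lower() for w in text.split()]
    return [w for w in words if w in TOY_SCORES]


def toy_familiarity(concept: str) -> float:
    return TOY_SCORES[concept]


def toy_generate(text: str) -> str:
    return "answer to: " + text


answered = gate_response(
    "What is aspirin?", toy_extract, toy_familiarity, toy_generate
)
withheld = gate_response(
    "Does aspirin interact with zorvatrine?",
    toy_extract, toy_familiarity, toy_generate,
)
```

Here `answered` holds a generated string, while `withheld` is `None` because the made-up concept "zorvatrine" scores below the threshold. The point of the sketch is that the check runs before generation, so it does not depend on the format or style of any model output.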

Towards Mitigating Hallucination in Large Language Models via Self-Reflection

AutoHall: Automated Hallucination Dataset Generation for Large Language Models

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models

Hallucination Detection and Hallucination Mitigation: An Investigation

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

Prompt-Guided Internal States for Hallucination Detection of Large Language Models

A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models

Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models

InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers

Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models

Mitigating Entity-Level Hallucination in Large Language Models

Cost-Effective Hallucination Detection for LLMs

Evaluation and Analysis of Hallucination in Large Vision-Language Models

Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach

Hallucination of Multimodal Large Language Models: A Survey