Abstract: Large language models (LLMs) have transformed the landscape of language processing, yet they still face significant challenges in security, privacy, and the generation of seemingly coherent but factually inaccurate outputs, commonly referred to as hallucinations. One particularly pressing issue is Fact-Conflicting Hallucination (FCH), where LLMs generate content that directly contradicts established facts. Tackling FCH is difficult because of two primary obstacles. First, automating the construction and updating of benchmark datasets is challenging, as current methods rely on static benchmarks that do not cover the diverse range of FCH scenarios. Second, validating the reasoning process behind LLM outputs is inherently complex, especially when intricate logical relations are involved. To address these obstacles, we propose an approach that leverages logic programming to enhance metamorphic testing for detecting FCH. Our method gathers data from sources such as Wikipedia, expands it with logical reasoning to create diverse test cases, assesses LLMs through structured prompts, and validates the coherence of their responses using semantic-aware assessment mechanisms. We generate test cases and detect hallucinations across six different LLMs spanning nine domains, revealing hallucination rates ranging from 24.7% to 59.8%. Key observations indicate that LLMs struggle particularly with temporal concepts and out-of-distribution knowledge, and that they exhibit deficiencies in logical reasoning. These outcomes underscore the efficacy of the logic-based test cases generated by our tool in both triggering and identifying hallucinations, and they highlight the need for ongoing collaborative efforts within the community to detect and address LLM hallucinations.
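To make the pipeline in the abstract more concrete, the following is a minimal Python sketch of the general idea: start from a few seed facts, derive new facts with a logical rule, and turn each derived fact into a test question with a known expected answer. It is an illustration only, not the paper's implementation. The seed triples, the `located_in` relation, the choice of a transitivity rule, and the helper names `derive_transitive`, `make_test_cases`, and `is_hallucination` are all assumptions introduced here. The final check is a simple string comparison standing in for the semantic-aware assessment the abstract mentions.

```python
from itertools import product

# Hypothetical seed facts as (subject, relation, object) triples.
# In the abstract these would come from a source such as Wikipedia.
SEED_FACTS = {
    ("Paris", "located_in", "France"),
    ("France", "located_in", "Europe"),
    ("Louvre", "located_in", "Paris"),
}


def derive_transitive(facts, relation):
    """Return facts newly implied by transitivity: r(a, b) and r(b, c) give r(a, c)."""
    known = set(facts)
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        pairs = [(s, o) for (s, r, o) in known if r == relation]
        for (a, b), (c, d) in product(pairs, pairs):
            new_fact = (a, relation, d)
            if b == c and a != d and new_fact not in known:
                known.add(new_fact)
                changed = True
    return known - set(facts)


def make_test_cases(derived):
    """Turn each derived fact into a (prompt, expected_answer) pair."""
    return sorted(
        (f"Is it true that {s} is {r.replace('_', ' ')} {o}? Answer yes or no.", "yes")
        for (s, r, o) in derived
    )


def is_hallucination(llm_answer, expected):
    """Crude stand-in for semantic-aware validation: compare the leading word."""
    return not llm_answer.strip().lower().startswith(expected)


derived_facts = derive_transitive(SEED_FACTS, "located_in")
test_cases = make_test_cases(derived_facts)
for prompt, expected in test_cases:
    print(prompt, "| expected:", expected)
```

In this sketch each prompt would be sent to the model under test, and a response that contradicts the derived ground truth would be flagged. A real setup would need a richer rule set and a more robust comparison of the model's reasoning than a prefix match.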

PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models

Fine-grained Hallucination Detection and Editing for Language Models

FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning

Mitigating Large Language Model Hallucination with Faithful Finetuning

Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites

VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models

Hallucination Detection and Hallucination Mitigation: An Investigation

Halu-J: Critique-Based Hallucination Judge

HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

Drowzee: Metamorphic Testing for Fact-Conflicting Hallucination Detection in Large Language Models

Cost-Effective Hallucination Detection for LLMs

Prompt-Guided Internal States for Hallucination Detection of Large Language Models

Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions

Woodpecker: Hallucination Correction for Multimodal Large Language Models

EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models

Towards Detecting LLMs Hallucination via Markov Chain-based Multi-agent Debate Framework

Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models