Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations

Deren Lei, Yaxi Li, Mengya Hu, Mingyu Wang, Vincent Yun, Emily Ching, Eslam Kamal
2023-10-10
Abstract: Large language models (LLMs) can generate fluent natural language texts when given relevant documents as background context. This ability has attracted considerable interest in developing industry applications of LLMs. However, LLMs are prone to generate hallucinations that are not supported by the provided sources. In this paper, we propose a hierarchical framework to detect and mitigate such ungrounded hallucination. Our framework uses Chain of Natural Language Inference (CoNLI) for hallucination detection and hallucination reduction via post-editing. Our approach achieves state-of-the-art performance on hallucination detection and enhances text quality through rewrite, using LLMs without any fine-tuning or domain-specific prompt engineering. We show that this simple plug-and-play framework can serve as an effective choice for hallucination detection and reduction, achieving competitive performance across various contexts.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses hallucinations in large language models (LLMs) when they generate natural language text. Specifically, it focuses on **ungrounded hallucinations**: generated sentences that contradict, or cannot be verified against, the provided background text.

#### Main Contributions

1. **Proposed Framework**: The paper introduces a hierarchical framework called the Chain of Natural Language Inference (CoNLI) for detecting and reducing ungrounded hallucinations.
2. **Detection and Correction**: The framework identifies hallucinations through sentence-level and entity-level detection and corrects them via post-editing.
3. **No Fine-Tuning Required**: The approach requires neither fine-tuning of the base language model nor domain-specific prompt engineering, which makes it highly generalizable.
4. **Performance Improvement**: Experimental results show that CoNLI performs well across various benchmarks, reducing hallucinations while improving text quality.

#### Method Overview

- **Hierarchical Detection**: Sentence-level detection is performed first, followed by entity-level detection for sentences where no hallucinations were initially found.
- **Post-Editing**: Based on the detection results, simple post-editing techniques are used to correct hallucinations while maintaining the basic structure of the original response.

Through this approach, the paper provides a simple plug-and-play framework that reduces ungrounded hallucinations in different contexts.
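The control flow described in the method overview can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper uses LLM-based natural language inference for the judgments and an LLM for rewriting, whereas the judges, the entity extractor, and the rewriter below (`toy_sentence_judge`, `toy_entity_judge`, `extract_entities`, `toy_rewrite`) are hypothetical stand-ins chosen only so the example runs without a model. What the sketch does show is the hierarchy: each sentence is checked first, and only sentences that pass are then checked entity by entity.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical judge signature: (source, claim) -> True if the source supports the claim.
# In a real system this would be an NLI call to an LLM; here it is pluggable.
Judge = Callable[[str, str], bool]


@dataclass
class Finding:
    sentence: str
    level: str  # "sentence" or "entity"
    entities: List[str] = field(default_factory=list)


def split_sentences(text: str) -> List[str]:
    # Naive splitter, for illustration only.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_entities(sentence: str) -> List[str]:
    # Stand-in entity detector (an assumption): capitalised words and integers.
    return re.findall(r"\b(?:[A-Z][a-z]+|\d+)\b", sentence)


def detect(source: str, response: str, sentence_judge: Judge, entity_judge: Judge) -> List[Finding]:
    findings = []
    for sent in split_sentences(response):
        # Stage 1: sentence-level check.
        if not sentence_judge(source, sent):
            findings.append(Finding(sent, "sentence"))
            continue
        # Stage 2: entity-level check, only for sentences that passed stage 1.
        bad = [e for e in extract_entities(sent) if not entity_judge(source, e)]
        if bad:
            findings.append(Finding(sent, "entity", bad))
    return findings


def post_edit(source: str, response: str, findings: List[Finding],
              rewrite: Callable[[str, Finding], str]) -> str:
    # Unflagged sentences are kept verbatim so the response keeps its structure.
    flagged = {f.sentence: f for f in findings}
    kept = []
    for sent in split_sentences(response):
        if sent in flagged:
            sent = rewrite(source, flagged[sent])
        if sent:
            kept.append(sent)
    return " ".join(kept)


# --- Toy stand-ins (assumptions, not the paper's components) ---
def tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def toy_sentence_judge(source: str, sentence: str) -> bool:
    # Treats a sentence as grounded if at least half of its words occur in the source.
    words = tokens(sentence)
    return len(words & tokens(source)) / max(len(words), 1) >= 0.5


def toy_entity_judge(source: str, entity: str) -> bool:
    return entity.lower() in tokens(source)


def toy_rewrite(source: str, finding: Finding) -> str:
    # Drops ungrounded sentences; masks ungrounded entities in otherwise grounded ones.
    if finding.level == "sentence":
        return ""
    text = finding.sentence
    for e in finding.entities:
        text = re.sub(rf"\b{re.escape(e)}\b", "[unverified]", text)
    return text


source = ("Acme reported revenue of 12 million dollars in 2022. "
          "The company is based in Berlin.")
response = ("Acme reported revenue of 15 million dollars in 2022. "
            "The company is based in Berlin. "
            "Its founder won a chess tournament.")

findings = detect(source, response, toy_sentence_judge, toy_entity_judge)
edited = post_edit(source, response, findings, toy_rewrite)
print(edited)
```

In this toy run the first sentence passes the sentence-level check but is flagged at the entity level (the figure `15` does not appear in the source), and the third sentence is flagged at the sentence level. Keeping the judges and the rewriter as plain callables reflects the plug-and-play idea: swapping in an LLM-backed judge would not change the detection loop.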