Abstract: Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads within the Transformer architecture, known as retrieval heads, responsible for extracting relevant contextual information. We hypothesise that masking these retrieval heads can induce hallucinations and that contrasting the outputs of the base LLM and the masked LLM can reduce hallucinations. To this end, we propose Decoding by Contrasting Retrieval Heads (DeCoRe), a novel training-free decoding strategy that amplifies information found in the context and model parameters. DeCoRe mitigates potentially hallucinated responses by dynamically contrasting the outputs of the base LLM and the masked LLM, using conditional entropy as a guide. Our extensive experiments confirm that DeCoRe significantly improves performance on tasks requiring high contextual faithfulness, such as summarisation (XSum by 18.6%), instruction following (MemoTrap by 10.9%), and open-book question answering (NQ-Open by 2.4% and NQ-Swap by 5.5%).

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper addresses the hallucination problem in text generated by large language models (LLMs). Specifically, LLMs sometimes produce unfaithful or factually incorrect content, either by misrepresenting the provided context or by incorrectly recalling internal knowledge. Hallucination seriously undermines the reliability of LLMs, especially in high-risk applications such as clinical decision-making or legal reasoning.

### Overview of the solution

To address this problem, the authors propose a new training-free decoding strategy named **Decoding by Contrasting Retrieval Heads (DeCoRe)**. Its main features are as follows:

1. **Building on retrieval heads**: Recent studies have identified certain attention heads in the Transformer architecture, called "retrieval heads", that are responsible for extracting relevant information from the given context. The authors hypothesise that masking these heads induces hallucinations, which makes the masked model a useful point of contrast.
2. **Contrastive decoding**: DeCoRe reduces hallucinations by contrasting the outputs of the base LLM with those of the same LLM with its retrieval heads masked, thereby amplifying the information found in the context and in the model parameters.
3. **Dynamically adjusting the contrast strength**: To control the contrastive decoding process, DeCoRe uses conditional entropy as a guide. Conditional entropy reflects the model's uncertainty about the next token. When the conditional entropy is high, the contrast strength is increased, which reduces the chance of generating a hallucinated token.

### Experimental results

The authors verify the effectiveness of DeCoRe through a series of experiments. The results show that DeCoRe significantly improves performance on tasks that require high contextual faithfulness:

- **Summarisation** (XSum dataset): performance improves by 18.6%.
- **Instruction following** (MemoTrap dataset): performance improves by 10.9%.
- **Open-book question answering** (NQ-Open and NQ-Swap datasets): performance improves by 2.4% and 5.5% respectively.

In addition, DeCoRe performs well on factual recall tasks. For example, on the TriviaQA and PopQA datasets, DeCoRe improves the accuracy of the model.

### Conclusion

By contrasting the outputs of the base LLM and the LLM with masked retrieval heads, DeCoRe reduces hallucinations and improves the performance of LLMs on multiple tasks. The method performs well on tasks that require contextual faithfulness and also shows advantages on factual recall tasks.
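
The contrast between the base and masked models, with entropy-controlled strength, can be illustrated with a minimal sketch of a single decoding step. This is not the authors' implementation. It assumes the common contrastive-decoding form `(1 + alpha) * log p_base - alpha * log p_masked` and assumes that `alpha` is set equal to the entropy of the base model's next-token distribution; the summary above only states that conditional entropy guides the contrast strength. The function names and the toy logits are hypothetical, and practical refinements such as a plausibility filter over candidate tokens are omitted.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def entropy(log_probs):
    """Shannon entropy (in nats) of a distribution given as log-probabilities."""
    return -sum(math.exp(lp) * lp for lp in log_probs)


def contrast_step(base_logits, masked_logits):
    """Return (scores, alpha) for one decoding step.

    Assumption (not confirmed by the source): alpha equals the entropy of
    the base model's next-token distribution, so uncertain steps are
    contrasted more strongly against the masked model.
    """
    base_lp = log_softmax(base_logits)
    masked_lp = log_softmax(masked_logits)
    alpha = entropy(base_lp)
    scores = [(1 + alpha) * b - alpha * m for b, m in zip(base_lp, masked_lp)]
    return scores, alpha


def pick_token(scores):
    """Greedy choice: index of the highest-scoring token."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical logits over a 3-token vocabulary.
# The base model is torn between tokens 0 and 1 and narrowly prefers token 1.
base_logits = [1.9, 2.0, 0.0]
# With retrieval heads masked, the model strongly prefers token 1,
# which marks token 1 as the likely hallucination.
masked_logits = [0.5, 2.5, 0.0]

scores, alpha = contrast_step(base_logits, masked_logits)
print("greedy choice of base model:", pick_token(base_logits))
print("choice after contrast:", pick_token(scores))
print("contrast strength alpha:", round(alpha, 3))
```

In this toy case the base model alone would pick the token that the masked model favours, while the contrasted scores penalise it. When the base distribution is sharply peaked, the entropy and therefore `alpha` approach zero, and the step reduces to ordinary greedy decoding on the base model.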

DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations

MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models

Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused

CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models

Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding

Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding

Embedding and Gradient Say Wrong: A White-Box Method for Hallucination Detection

DecoPrompt : Decoding Prompts Reduces Hallucinations when Large Language Models Meet False Premises

Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

DOPRA: Decoding Over-accumulation Penalization and Re-allocation in Specific Weighting Layer

Mitigating Hallucinations in Large Vision-Language Models (LVLMs) via Language-Contrastive Decoding (LCD)

Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization

HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding