Abstract: The development of Large Language Models (LLMs) has significantly advanced various AI applications in commercial and scientific research fields, such as scientific literature summarization, writing assistance, and knowledge graph construction. However, a significant challenge is the high risk of hallucination during LLM inference, which can lead to security concerns like factual inaccuracies, inconsistent information, and fabricated content. To tackle this issue, it is essential to develop effective methods for reducing hallucination while maintaining the original capabilities of the LLM. This paper introduces a novel approach called Iterative Model-level Contrastive Learning (Iter-AHMCL) to address hallucination. This method modifies the representation layers of pre-trained LLMs by using contrastive "positive" and "negative" models, trained on data with and without hallucinations. By leveraging the differences between these two models, we create a more straightforward pathway to eliminate hallucinations, and the iterative nature of contrastive learning further enhances performance. Experimental validation on four pre-trained foundation LLMs (LLaMA2, Alpaca, LLaMA3, and Qwen), fine-tuned with a specially designed dataset, shows that our approach achieves an average improvement of 10.1 points on the TruthfulQA benchmark. Comprehensive experiments demonstrate the effectiveness of Iter-AHMCL in reducing hallucination while maintaining the general capabilities of LLMs.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

The paper addresses hallucinations produced by large language models (LLMs) during inference. Hallucination here means that the model generates inaccurate or fabricated information, which leads to factual errors and inconsistencies and raises safety concerns. To tackle this challenge, the paper proposes a novel method, Iterative Model-level Contrastive Learning (Iter-AHMCL), which modifies the representation layers of pre-trained LLMs to reduce hallucinations while maintaining the original capabilities of the model.

### Main Contributions

1. **Proposing the Iter-AHMCL Method**: Implementing model-level contrastive learning through adaptive development of positive and negative feature representations to effectively reduce hallucinations.
2. **Iterative Update Guidance Model**: Designing a model-level iterative strategy applicable to various LLMs, with code and all models to be released upon publication.
3. **Experimental Validation**: Conducting comprehensive experiments on various LLMs, demonstrating that the method reduces hallucinations while maintaining the general capabilities of the models.

### Method Overview

1. **Data Preparation**: Constructing a contrastive dataset for pre-training positive and negative guidance models.
2. **Utilization of Guidance Models**: Using pre-trained guidance models to adjust the direction of intermediate representations.
3. **Iterative Improvement**: Ensuring continuous adaptation and improvement of the guidance models through iterative updates to maintain optimal performance.

### Experimental Analysis

1. **GMP Training Process**: Evaluating the differences in positive and negative representations.
2. **Effectiveness in Reducing Hallucinations**: Verifying the effectiveness of Iter-AHMCL in reducing LLM hallucinations.
3. **Retention of General Capabilities**: Assessing the performance of Iter-AHMCL in retaining model knowledge and general language capabilities.
4. **Benefits of the Iterative Process**: Exploring how the iterative process helps LLMs reduce hallucinations through representation editing.
5. **Transferability of Guidance Models**: Evaluating the transferability of guidance models from one LLM base model to another.

### Experimental Setup

1. **Data Preparation**: Using the PKU-SafeRLHF and Alpaca-instruction datasets.
2. **Base Model Selection**: Including models such as Alpaca, LLaMA2, LLaMA3, and Qwen.
3. **Comparison Methods**: Including the original base models, LoRRA, and Pure-MG.
4. **Hyperparameters**: Detailed listing of hyperparameters for GMP and Iter-AHMCL.
5. **Evaluation Methods**: Using benchmarks such as TruthfulQA and MMLU for evaluation.

Through these methods and experiments, the paper demonstrates the effectiveness and practicality of Iter-AHMCL in reducing LLM hallucinations.
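
The idea of using positive and negative guidance models to adjust the direction of intermediate representations can be illustrated with a toy sketch. This is not the paper's formulation: it assumes, purely for illustration, that the guidance signal is the element-wise difference between a positive and a negative hidden state, and that the edit is an additive shift scaled by a hypothetical factor `alpha`. The function names `guidance_direction` and `steer` are invented here, and the iterative update of the guidance models is not modeled.

```python
from typing import List

Vector = List[float]


def guidance_direction(pos_hidden: Vector, neg_hidden: Vector) -> Vector:
    """Assumed guidance signal: positive minus negative representation."""
    if len(pos_hidden) != len(neg_hidden):
        raise ValueError("representations must have the same dimension")
    return [p - n for p, n in zip(pos_hidden, neg_hidden)]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Shift a hidden state along the guidance direction (assumed additive edit)."""
    if len(hidden) != len(direction):
        raise ValueError("hidden state and direction must have the same dimension")
    return [h + alpha * d for h, d in zip(hidden, direction)]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance, used only to inspect the toy example."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy 2-dimensional "hidden states"; real LLM representations are far larger.
    pos = [1.0, 0.0]    # stand-in for the positive guidance model's representation
    neg = [-1.0, 0.0]   # stand-in for the negative guidance model's representation
    base = [0.0, 0.0]   # stand-in for the base model's representation

    edited = steer(base, guidance_direction(pos, neg), alpha=0.25)
    print("edited:", edited)
    print("distance to positive before/after:",
          squared_distance(base, pos), squared_distance(edited, pos))
```

In this toy setting the edited vector moves toward the positive representation and away from the negative one. How the actual method computes, applies, and iteratively refreshes the guidance signal is specified in the paper itself, not in this sketch.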

Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning
