Mitigating Large Language Model Hallucination with Faithful Finetuning

Minda Hu,Bowei He,Yufei Wang,Liangyou Li,Chen Ma,Irwin King

2024-06-17

Abstract: Large language models (LLMs) have demonstrated remarkable performance on various natural language processing tasks. However, they are prone to generating fluent yet untruthful responses, known as "hallucinations". Hallucinations can lead to the spread of misinformation and cause harm in critical applications. Mitigating hallucinations is challenging as they arise from factors such as noisy data, model overconfidence, lack of knowledge, and the generation process itself. Recent efforts have attempted to address this issue through representation editing and decoding algorithms, reducing hallucinations without major structural changes or retraining. However, these approaches either implicitly edit LLMs' behavior in latent space or suppress the tendency to output unfaithful results during decoding instead of explicitly modeling on hallucination. In this work, we introduce Faithful Finetuning (F2), a novel method that explicitly models the process of faithful question answering through carefully designed loss functions during fine-tuning. We conduct extensive experiments on popular datasets and demonstrate that F2 achieves significant improvements over vanilla models and baselines.

Computation and Language

What problem does this paper attempt to address?

The paper addresses the "hallucinations" that occur when large language models (LLMs) generate text. Specifically:

1. **Problem Background**: Despite their strong performance on natural language processing tasks, large language models tend to generate fluent but untruthful responses, a phenomenon known as "hallucination." These inaccuracies reduce the reliability of the models and can cause harm in critical applications.
2. **Research Objective**: To reduce hallucinations, particularly in question-answering (QA) tasks, the paper proposes a method called "Faithful Finetuning" (F2). The method explicitly models the process of generating faithful responses through purpose-built loss functions applied during finetuning.
3. **Specific Approach**: F2 first decomposes the traditional QA objective into two sub-goals, internal fact retrieval and fact-based QA, and designs targeted finetuning strategies for each to strengthen the model's use of factual information. It then identifies the layers and hotspot areas where the model is prone to hallucination and applies weighted training to them, which further improves accuracy and reliability.

In summary, the paper aims to reduce hallucinations in LLM-generated text through a new finetuning method, thereby making the models more reliable and trustworthy in practical applications.
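The approach described above, two sub-objectives combined with extra weight on hallucination-prone spans, can be illustrated with a small sketch. This is not the paper's implementation: the function names `weighted_nll` and `f2_style_loss`, the mixing coefficient `lam`, the `hotspot_weight` value, and the idea of marking hotspots with a boolean mask are all assumptions made for illustration, since the summary does not give the actual loss formulation.

```python
import math


def weighted_nll(token_probs, hotspot_mask, hotspot_weight=2.0):
    """Weighted mean negative log-likelihood over target tokens.

    token_probs: probability the model assigns to each reference token.
    hotspot_mask: True where a token lies in a span assumed to be
        hallucination-prone (hypothetical labeling, e.g. entity mentions).
    hotspot_weight: assumed multiplier applied to hotspot tokens.
    """
    weights = [hotspot_weight if is_hot else 1.0 for is_hot in hotspot_mask]
    total = sum(w * -math.log(p) for w, p in zip(weights, token_probs))
    return total / sum(weights)


def f2_style_loss(retrieval, grounded_qa, lam=0.5, hotspot_weight=2.0):
    """Combine the two sub-objectives into a single training loss.

    retrieval, grounded_qa: (token_probs, hotspot_mask) pairs for the
        internal fact retrieval target and the fact-based QA target.
    lam: assumed mixing coefficient between the two sub-objectives.
    """
    loss_retrieval = weighted_nll(*retrieval, hotspot_weight=hotspot_weight)
    loss_qa = weighted_nll(*grounded_qa, hotspot_weight=hotspot_weight)
    return lam * loss_retrieval + (1.0 - lam) * loss_qa


# Toy example: the second token of each target is marked as a hotspot.
retrieval_example = ([0.9, 0.1], [False, True])
qa_example = ([0.8, 0.4], [False, True])
loss = f2_style_loss(retrieval_example, qa_example)
```

In a real finetuning setup the per-token probabilities would come from the model's output distribution, and the layer selection mentioned in the summary would be handled separately, for instance by restricting which parameters receive gradient updates.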


Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused

Beyond Fine-Tuning: Effective Strategies for Mitigating Hallucinations in Large Language Models for Data Analytics

A Debate-Driven Experiment on LLM Hallucinations and Accuracy

Unfamiliar Finetuning Examples Control How Language Models Hallucinate

Mitigating Hallucination Issues in Small-Parameter LLMs Through Inter-Layer Contrastive Decoding

Fine-grained Hallucination Detection and Editing for Language Models

Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators

Towards Mitigating Hallucination in Large Language Models via Self-Reflection

Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy

On-Policy Fine-grained Knowledge Feedback for Hallucination Mitigation

PFME: A Modular Approach for Fine-grained Hallucination Detection and Editing of Large Language Models

Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation

A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation

Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning

TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation