Improving Factuality in Large Language Models via Decoding-Time Hallucinatory and Truthful Comparators

Dingkang Yang,Dongling Xiao,Jinjie Wei,Mingcheng Li,Zhaoyu Chen,Ke Li,Lihua Zhang

2024-08-22

Abstract: Despite their remarkable capabilities, Large Language Models (LLMs) are prone to generate responses that contradict verifiable facts, i.e., unfaithful hallucination content. Existing efforts generally focus on optimizing model parameters or editing semantic representations, which compromise the internal factual knowledge of target LLMs. In addition, hallucinations typically exhibit multifaceted patterns in downstream tasks, limiting the model's holistic performance across tasks. In this paper, we propose a Comparator-driven Decoding-Time (CDT) framework to alleviate the response hallucination. Firstly, we construct hallucinatory and truthful comparators with multi-task fine-tuning samples. In this case, we present an instruction prototype-guided mixture of experts strategy to enhance the ability of the corresponding comparators to capture different hallucination or truthfulness patterns in distinct task instructions. CDT constrains next-token predictions to factuality-robust distributions by contrasting the logit differences between the target LLMs and these comparators. Systematic experiments on multiple downstream tasks show that our framework can significantly improve the model performance and response factuality.
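The abstract does not spell out how the instruction prototype-guided mixture of experts routes an instruction to experts. The following is a minimal sketch of one plausible reading, not the authors' implementation: each expert is paired with a prototype vector, and the gate is a softmax over cosine similarities between the instruction embedding and those prototypes. The function names, the `temperature` parameter, and the use of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def gate_weights(instruction_vec, prototypes, temperature=0.1):
    """Hypothetical gate: softmax over instruction-to-prototype similarities.

    `prototypes` holds one vector per expert. This routing rule is an
    assumption for illustration; the source does not specify it.
    """
    scaled = [cosine(instruction_vec, p) / temperature for p in prototypes]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mix_expert_outputs(expert_outputs, weights):
    """Weighted sum of per-expert output vectors using the gate weights."""
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
```

Under these assumptions, an instruction whose embedding lies close to one task prototype sends most of the weight to that prototype's expert, which is one way a comparator could specialize in task-specific hallucination or truthfulness patterns.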

Computation and Language

What problem does this paper attempt to address?

This paper addresses the tendency of large language models (LLMs) to generate responses that contradict verifiable facts, i.e., unfaithful hallucination content. Although LLMs perform well on many tasks, their outputs often contain plausible-sounding but incorrect statements, which limits their credibility and reliability in practical applications. The paper points out that existing solutions usually optimize model parameters or edit semantic representations, which can compromise the factual knowledge inside the target LLMs. In addition, hallucinations typically exhibit multifaceted patterns across downstream tasks, limiting the model's overall performance across tasks. To address these problems, the paper proposes a Comparator-driven Decoding-Time (CDT) framework that aims to reduce hallucination in responses. It constructs hallucinatory and truthful comparators and, during decoding, contrasts the logit differences between the target LLMs and these comparators, constraining next-token predictions toward more factual distributions. According to the paper, experiments on multiple downstream tasks show that the framework significantly improves model performance and response factuality.
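The decoding-time contrast described above can be illustrated with a small sketch. The summary does not give the paper's exact combination rule, so the formula used here (target logits plus a scaled difference between truthful and hallucinatory comparator logits) is an assumption, as are the function names and the `alpha` weight.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrast_logits(target, truthful, hallucinatory, alpha=1.0):
    """Hypothetical rule: target + alpha * (truthful - hallucinatory).

    Tokens favored by the truthful comparator are boosted and tokens
    favored by the hallucinatory comparator are penalized. The formula
    in the paper may differ; this is an assumed form for illustration.
    """
    if not (len(target) == len(truthful) == len(hallucinatory)):
        raise ValueError("all logit vectors must cover the same vocabulary")
    return [
        t + alpha * (tr - ha)
        for t, tr, ha in zip(target, truthful, hallucinatory)
    ]


def greedy_next_token(target, truthful, hallucinatory, alpha=1.0):
    """Return the index of the most probable token after the contrast."""
    probs = softmax(contrast_logits(target, truthful, hallucinatory, alpha))
    return max(range(len(probs)), key=probs.__getitem__)
```

With `alpha=0` the sketch reduces to ordinary greedy decoding from the target model, so `alpha` controls how strongly the comparators can override the target's own preference.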

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

Improving Factuality by Contrastive Decoding with Factual and Hallucination Prompts

Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused

Mitigating Hallucination Issues in Small-Parameter LLMs Through Inter-Layer Contrastive Decoding

TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space

Drowzee: Metamorphic Testing for Fact-Conflicting Hallucination Detection in Large Language Models

A Debate-Driven Experiment on LLM Hallucinations and Accuracy

Mitigating Large Language Model Hallucination with Faithful Finetuning

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

On Large Language Models' Hallucination with Regard to Known Facts

On the Universal Truthfulness Hyperplane Inside LLMs

Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning

Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees

Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories