Sources of Hallucination by Large Language Models on Inference Tasks

Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman
2023-10-23
Abstract: Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level of sentences: we show that, regardless of the premise, models falsely label NLI test samples as entailing when the hypothesis is attested in training data, and that entities are used as "indices" to access the memorized data. Second, statistical patterns of usage learned at the level of corpora: we further show a similar effect when the premise predicate is less frequent than that of the hypothesis in the training data, a bias following from previous studies. We demonstrate that LLMs perform significantly worse on NLI test samples which do not conform to these biases than those which do, and we offer these as valuable controls for future LLM evaluation.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses false-positive hallucination by large language models (LLMs) on natural language inference (NLI) tasks. Through a series of behavioral studies, the authors identify two biases, both originating in pretraining, that LLMs exhibit when performing NLI:

1. **Attestation Bias**: LLMs tend to label a sample as entailing when the hypothesis is attested in their training data, regardless of the premise, even when the premise is irrelevant to the hypothesis. The models rely on their propositional memory of the queried statement rather than reasoning from the provided premise, and they use the entities in the statement as "indices" to access that memorized data.
2. **Relative Frequency Bias**: LLMs behave as if following a simple rule: when the predicate in the premise is less frequent in the training data than the predicate in the hypothesis, they are more likely to affirm entailment. This rule reflects statistical patterns of usage in natural text, not the actual meaning of the sentences.

Using behavioral experiments that target these biases, the paper shows that LLMs perform significantly worse on NLI test samples that do not conform to the biases than on samples that do. The authors propose both biases as control conditions for future LLM evaluation. The findings bear on how LLMs arrive at their NLI judgments and on the risk of hallucination in applications such as question answering and summarization.
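The two biases can be read as label-predicting heuristics that ignore the meaning of the premise, which suggests one way to build the proposed controls: split a test set into samples whose gold label agrees with the heuristic and samples where it does not. The following is a minimal sketch under stated assumptions, not the authors' implementation. The `NLISample` fields, the function names, the predicate counts in `toy_freq`, and the toy samples are all hypothetical and invented for illustration; in practice, attestation and predicate frequency would have to be estimated, since training corpora are typically not directly accessible.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class NLISample:
    # Hypothetical record layout, invented for this illustration.
    premise_predicate: str
    hypothesis_predicate: str
    hypothesis_attested: bool  # assumed proxy: hypothesis appears in training data
    gold_entails: bool         # gold NLI label


def attestation_prediction(sample: NLISample) -> bool:
    """Attestation bias as a heuristic: say 'entails' iff the hypothesis is attested."""
    return sample.hypothesis_attested


def relative_frequency_prediction(sample: NLISample, freq: Dict[str, int]) -> bool:
    """Relative frequency bias as a heuristic: say 'entails' iff the premise
    predicate is less frequent than the hypothesis predicate."""
    return freq.get(sample.premise_predicate, 0) < freq.get(sample.hypothesis_predicate, 0)


def split_by_bias(
    samples: List[NLISample], predict: Callable[[NLISample], bool]
) -> Tuple[List[NLISample], List[NLISample]]:
    """Split samples into those whose gold label agrees with the bias
    (bias-consistent) and those where it disagrees (bias-adversarial)."""
    consistent = [s for s in samples if predict(s) == s.gold_entails]
    adversarial = [s for s in samples if predict(s) != s.gold_entails]
    return consistent, adversarial


# Made-up predicate counts and samples, for illustration only.
toy_freq = {"purchase": 120, "own": 900, "buy": 1500}
samples = [
    NLISample("purchase", "own", hypothesis_attested=False, gold_entails=True),
    NLISample("own", "purchase", hypothesis_attested=False, gold_entails=False),
    NLISample("buy", "own", hypothesis_attested=True, gold_entails=True),
    NLISample("own", "buy", hypothesis_attested=True, gold_entails=False),
]

freq_consistent, freq_adversarial = split_by_bias(
    samples, lambda s: relative_frequency_prediction(s, toy_freq)
)
att_consistent, att_adversarial = split_by_bias(samples, attestation_prediction)

print(len(freq_consistent), len(freq_adversarial))
print(len(att_consistent), len(att_adversarial))
```

Comparing a model's accuracy on the bias-consistent subset against the bias-adversarial subset would then indicate how much of its apparent NLI performance a given bias could account for.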