Redefining "Hallucination" in LLMs: Towards a psychology-informed framework for mitigating misinformation

Elijah Berberette, Jack Hutchins, Amir Sadovnik
2024-02-01
Abstract: In recent years, large language models (LLMs) have become incredibly popular, with ChatGPT for example being used by over a billion users. While these models exhibit remarkable language understanding and logical prowess, a notable challenge surfaces in the form of "hallucinations." This phenomenon results in LLMs outputting misinformation in a confident manner, which can lead to devastating consequences with such a large user base. However, we question the appropriateness of the term "hallucination" in LLMs, proposing a psychological taxonomy based on cognitive biases and other psychological phenomena. Our approach offers a more fine-grained understanding of this phenomenon, allowing for targeted solutions. By leveraging insights from how humans internally resolve similar challenges, we aim to develop strategies to mitigate LLM hallucinations. This interdisciplinary approach seeks to move beyond conventional terminology, providing a nuanced understanding and actionable pathways for improvement in LLM reliability.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the "hallucination" phenomenon in large language models (LLMs): the model confidently generates information that seems plausible but is actually wrong, which misleads users and can cause harm. Because these models are so widely deployed, the consequences can be serious given the size of the user base. Specifically, the paper focuses on the following aspects:

1. **Appropriateness of the term**: The paper questions whether "hallucination" is an appropriate term for this phenomenon in LLMs. The authors point out that human hallucinations are essentially different from what occurs in LLMs, so more precise terms are needed to describe and classify these behaviors.
2. **Application of psychological terms**: To describe and understand the phenomenon more accurately, the authors propose a classification framework that draws on cognitive biases and other psychological phenomena. This includes but is not limited to:
   - **Source Amnesia**: The model cannot accurately trace the source of information when generating content.
   - **Recency Effect**: The model is more inclined to rely on the most recently accessed information.
   - **Availability Heuristic**: The model generates content according to the most easily accessible information in the training data.
   - **Suggestibility**: The model is influenced by user prompts and produces biased output as a result.
   - **Cognitive Dissonance**: The internal conflict that arises in the model when it processes contradictory information.
   - **Confabulation**: The model generates confident but misleading outputs, similar to humans unconsciously fabricating false memories.
3. **Directions of solutions**: By introducing psychological concepts, the authors hope to provide new perspectives and methods for understanding and mitigating the hallucination phenomenon in LLMs, thereby improving the reliability and accuracy of these models.

In conclusion, this paper aims to redefine and classify the hallucination phenomenon in LLMs and, by introducing terms and theories from psychology, to provide more fine-grained and effective strategies for addressing it.