Banishing LLM Hallucinations Requires Rethinking Generalization

Johnny Li, Saksham Consul, Eda Zhou, James Wong, Naila Farooqui, Yuxin Ye, Nithyashree Manohar, Zhuxiaona Wei, Tian Wu, Ben Echols, Sharon Zhou, Gregory Diamos
2024-06-25
Abstract: Despite their powerful chat, coding, and reasoning abilities, Large Language Models (LLMs) frequently hallucinate. Conventional wisdom suggests that hallucinations are a consequence of a balance between creativity and factuality, which can be mitigated, but not eliminated, by grounding the LLM in external knowledge sources. Through extensive systematic experiments, we show that these traditional approaches fail to explain why LLMs hallucinate in practice. Specifically, we show that LLMs augmented with a massive Mixture of Memory Experts (MoME) can easily memorize large datasets of random numbers. We corroborate these experimental findings with a theoretical construction showing that simple neural networks trained to predict the next token hallucinate when the training loss is above a threshold as it usually does in practice when training on internet scale data. We interpret our findings by comparing against traditional retrieval methods for mitigating hallucinations. We use our findings to design a first generation model for removing hallucinations -- Lamini-1 -- that stores facts in a massive mixture of millions of memory experts that are retrieved dynamically.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper addresses the hallucination problem in large language models (LLMs). Although these models show strong chat, coding, and reasoning abilities, they frequently hallucinate, generating content that is not factually accurate. The conventional view holds that hallucinations result from a balance between creativity and factuality, which can be mitigated, but not eliminated, by grounding the model in external knowledge sources.

Through systematic experiments, the authors argue that this conventional view fails to explain why LLMs hallucinate in practice. Specifically, they show that LLMs augmented with a massive Mixture of Memory Experts (MoME) can easily memorize large datasets of random numbers, and that memorizing such random data does not significantly increase generalization error. Their theoretical construction further indicates that simple neural networks trained to predict the next token hallucinate when the training loss is above a threshold, which is usually the case in practice when training on internet-scale data.

Based on these findings, the authors designed a first-generation model for removing hallucinations, Lamini-1, which stores facts in a massive mixture of millions of memory experts that are retrieved dynamically. According to this summary, the research helps improve the interpretability and practicality of LLMs, especially in fields that require precise answers, and may also guide more reliable model architecture designs.
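To make the idea of "facts stored in many memory experts and retrieved dynamically" more concrete, here is a minimal toy sketch. It is not Lamini-1 or the authors' MoME implementation, whose details are not given in the text above. It assumes, purely for illustration, that each "expert" is a (key, value) pair, that retrieval means picking the top-k keys by cosine similarity to a query, and that the "facts" are random integers. All names (`ToyMemoryBank`, `build_random_bank`, `normalize`) are hypothetical.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


class ToyMemoryBank:
    """Hypothetical stand-in for a bank of memory experts (one key/value pair each)."""

    def __init__(self):
        self.keys = []
        self.values = []

    def store(self, key, value):
        # Writing a fact just adds one more "expert" to the bank.
        self.keys.append(normalize(key))
        self.values.append(value)

    def retrieve(self, query, top_k=1):
        # Score every stored key against the query, then keep the top_k matches.
        q = normalize(query)
        scores = [sum(a * b for a, b in zip(q, k)) for k in self.keys]
        best = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_k]
        return [self.values[i] for i in best]


def build_random_bank(num_facts=200, dim=32, seed=0):
    """Fill a bank with random keys paired with random integer 'facts'."""
    rng = random.Random(seed)
    bank = ToyMemoryBank()
    facts = []
    for _ in range(num_facts):
        key = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        value = rng.randrange(10**9)
        bank.store(key, value)
        facts.append((key, value))
    return bank, facts


if __name__ == "__main__":
    bank, facts = build_random_bank()
    recalled = sum(bank.retrieve(key)[0] == value for key, value in facts)
    print(f"recalled {recalled} of {len(facts)} random facts")
```

The point of the sketch is only that an explicit lookup structure can recall arbitrary random values exactly, because each fact occupies its own slot instead of sharing weights with every other fact. A real system would use learned keys and a far larger, indexed bank.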