A Novel Approach to Eliminating Hallucinations in Large Language Model-Assisted Causal Discovery

Grace Sng, Yanming Zhang, Klaus Mueller
2024-11-16
Abstract: The increasing use of large language models (LLMs) in causal discovery as a substitute for human domain experts highlights the need for optimal model selection. This paper presents the first hallucination survey of popular LLMs for causal discovery. We show that hallucinations exist when using LLMs in causal discovery, so the choice of LLM is important. We propose using Retrieval Augmented Generation (RAG) to reduce hallucinations when quality data is available. Additionally, we introduce a novel method employing multiple LLMs with an arbiter in a debate to audit edges in causal graphs, achieving a comparable reduction in hallucinations to RAG.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses hallucinations produced by large language models (LLMs) when they are used for causal discovery. Specifically, the authors are concerned that, because LLMs often lack domain-specific training data, they may generate inaccurate or logically inconsistent information. This not only lowers the accuracy of the resulting causal graph but may also mislead researchers about the underlying causal relationships. To address this, the paper proposes two methods for reducing hallucinations:

1. **Retrieval Augmented Generation (RAG)**: When high-quality data is available, supplying the LLM with domain-specific knowledge reduces hallucinations. The experimental results show that RAG significantly reduces the hallucination rate of LLMs when auditing causal graphs.
2. **Multi-LLM debate with an arbiter**: When high-quality data is not available, multiple LLMs debate each edge of the causal graph, and an arbiter LLM weighs their arguments to reach a final conclusion. This method achieves a reduction in hallucinations comparable to RAG without requiring additional data.

Overall, the paper aims to make LLMs more reliable and accurate in causal discovery tasks, and thereby more useful for scientific research and decision-making.
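
The debate-with-arbiter idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the summary does not give their prompts, number of rounds, or model choices, so everything here is an assumption. The names `Model`, `audit_edge`, `always_yes`, `always_no`, and `majority_arbiter` are hypothetical, and the "models" are plain functions standing in for real LLM calls so that the example runs without any external service.

```python
from typing import Callable, List, Tuple

# Assumption: a "model" is any callable mapping a prompt string to a reply string.
# In practice this would wrap a call to an actual LLM.
Model = Callable[[str], str]


def audit_edge(
    cause: str,
    effect: str,
    debaters: List[Model],
    arbiter: Model,
    rounds: int = 2,
) -> Tuple[bool, List[str]]:
    """Debate one causal edge and return (edge_accepted, transcript)."""
    question = (
        f"Does '{cause}' directly cause '{effect}'? "
        "Answer YES or NO with a reason."
    )
    transcript: List[str] = []
    for _ in range(rounds):
        for i, debater in enumerate(debaters):
            # Each debater sees the question plus everything said so far.
            so_far = "\n".join(transcript)
            reply = debater(f"{question}\nDebate so far:\n{so_far}")
            transcript.append(f"Debater {i}: {reply}")
    # The arbiter reads the full transcript and issues the final verdict.
    verdict = arbiter(
        f"{question}\nDebate transcript:\n"
        + "\n".join(transcript)
        + "\nGive the final verdict, starting with YES or NO."
    )
    return verdict.strip().upper().startswith("YES"), transcript


# Toy stand-ins for LLMs (hypothetical, for demonstration only).
def always_yes(prompt: str) -> str:
    return "YES, the mechanism is plausible."


def always_no(prompt: str) -> str:
    return "NO, this is likely a spurious correlation."


def majority_arbiter(prompt: str) -> str:
    # Toy arbiter: counts YES/NO statements in the transcript lines.
    lines = [ln for ln in prompt.splitlines() if ln.startswith("Debater")]
    yes = sum(": YES" in ln for ln in lines)
    no = sum(": NO" in ln for ln in lines)
    return "YES" if yes > no else "NO"


accepted, log = audit_edge(
    "smoking", "lung cancer",
    debaters=[always_yes, always_yes, always_no],
    arbiter=majority_arbiter,
    rounds=1,
)
print(accepted, len(log))
```

A real arbiter would presumably weigh the quality of the arguments rather than count votes; the majority rule here only keeps the sketch deterministic. To audit a whole graph, one would call `audit_edge` once per edge and drop or flag the edges that are rejected.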