Causal Discovery with Language Models as Imperfect Experts

Stephanie Long, Alexandre Piché, Valentina Zantedeschi, Tibor Schuster, Alexandre Drouin
2023-07-06
Abstract: Understanding the causal relationships that underlie a system is a fundamental prerequisite to accurate decision-making. In this work, we explore how expert knowledge can be used to improve the data-driven identification of causal graphs, beyond Markov equivalence classes. In doing so, we consider a setting where we can query an expert about the orientation of causal relationships between variables, but where the expert may provide erroneous information. We propose strategies for amending such expert knowledge based on consistency properties, e.g., acyclicity and conditional independencies in the equivalence class. We then report a case study, on real data, where a large language model is used as an imperfect expert.
Artificial Intelligence, Computation and Language, Machine Learning
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper addresses how imperfect expert knowledge can be used to reduce uncertainty in causal graph identification. Data-driven causal discovery methods can typically identify only the Markov Equivalence Class (MEC) of the true causal graph, not the graph itself. The authors study how querying an expert about edge orientations can narrow down the set of candidate graphs. Because real-world experts may provide incorrect information, the paper proposes strategies that use this imperfect expert knowledge while enforcing consistency properties such as acyclicity and the conditional independencies of the equivalence class.

### Main Contributions

1. **Formalizing the Problem**: The use of imperfect expert knowledge in causal discovery is formalized as an optimization problem: minimize the size of the MEC while ensuring that the true graph remains in the reduced MEC with high probability.
2. **Greedy Algorithm**: A greedy algorithm based on Bayesian inference is proposed, which incrementally incorporates expert knowledge to optimize the objective function.
3. **Experimental Evaluation**: The method is evaluated on real data, both with simulated experts that return the correct orientation with a fixed probability and with large language models as imperfect experts.
4. **Empirical Analysis**: The effectiveness of the method is assessed on several networks, and the performance of large language models as experts is discussed.

### Background and Related Work

- **Causal Bayesian Networks**: Defines a causal graph, which is a directed acyclic graph (DAG), and its corresponding probability distribution.
- **Causal Discovery**: The task of recovering causal graphs from data. Existing methods are usually divided into constraint-based and score-based methods, but neither can fully determine the true causal graph from observational data alone.
- **Equivalence Class**: The MEC is the set of graphs that encode the same conditional independencies. Because the data cannot distinguish between these graphs, the uncertainty carries over to downstream tasks.
- **Expert Knowledge**: Previous research assumes that the knowledge provided by experts is correct, while this paper considers the possibility of experts making mistakes.

### Method

- **Problem Setting**: Defines a model of imperfect experts and how their knowledge is used to reduce the size of the MEC.
- **Noise Model**: Defines a probabilistic model of expert errors.
- **Bayesian Method**: Proposes methods for calculating prior and posterior probabilities in order to estimate the reliability of expert decisions.
- **Greedy Strategy**: Proposes two greedy strategies (S_size and S_risk) for selecting which edges to orient.

### Experimental Results

- **ε-Experts**: On all networks, the method combining the two strategies reduces the size of the MEC while keeping the true graph in the reduced MEC with probability at least 1-η.
- **Large Language Model Experts**: Overall, large language model experts can provide some causal knowledge and reduce the structural Hamming distance (SHD), but they perform worse than ε-experts on some datasets, especially on the ALARM dataset.

### Conclusion

The paper investigates how to use imperfect expert knowledge to improve the output of causal discovery algorithms. The proposed method is effective both theoretically and experimentally, especially when the expert satisfies the assumptions of the noise model. When large language models are used as experts, performance is lower, but the approach still shows potential value. Future research could explore noise models better suited to large language models and improved querying methods.
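
The noise model and Bayesian update summarized under Method can be illustrated on a toy example. The following is a minimal sketch and not the paper's algorithm. It assumes a uniform prior over the graphs in the MEC, an ε-expert whose answers are independently correct with probability 1-ε, and an MEC small enough to enumerate explicitly. The function names `posterior_over_mec` and `smallest_credible_subset` are hypothetical, and the subset selection is a simplified stand-in for the paper's greedy edge-selection strategies: it keeps the most probable graphs until their posterior mass reaches 1-η.

```python
from typing import FrozenSet, List, Tuple

Edge = Tuple[str, str]   # directed edge (cause, effect)
Dag = FrozenSet[Edge]    # a DAG represented by its set of directed edges


def posterior_over_mec(mec: List[Dag], expert: List[Edge], eps: float) -> List[float]:
    """Posterior p(G | expert answers) for each G in the MEC.

    Assumptions (illustrative, not taken from the paper):
    uniform prior over the MEC, and each expert orientation is
    independently correct with probability 1 - eps.
    """
    weights = []
    for dag in mec:
        w = 1.0
        for edge in expert:
            # Agreeing with the DAG has likelihood 1 - eps, disagreeing has eps.
            w *= (1.0 - eps) if edge in dag else eps
        weights.append(w)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("expert answers are inconsistent with every graph")
    return [w / total for w in weights]


def smallest_credible_subset(mec: List[Dag], post: List[float], eta: float):
    """Keep the most probable graphs until their mass is at least 1 - eta."""
    order = sorted(range(len(mec)), key=lambda i: post[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(mec[i])
        mass += post[i]
        if mass >= 1.0 - eta:
            break
    return kept, mass


if __name__ == "__main__":
    # MEC of the skeleton A - B - C with no v-structure: three DAGs.
    mec = [
        frozenset({("A", "B"), ("B", "C")}),  # A -> B -> C
        frozenset({("B", "A"), ("C", "B")}),  # A <- B <- C
        frozenset({("B", "A"), ("B", "C")}),  # A <- B -> C
    ]
    expert = [("A", "B"), ("B", "C")]         # expert says A -> B and B -> C
    post = posterior_over_mec(mec, expert, eps=0.1)
    kept, mass = smallest_credible_subset(mec, post, eta=0.2)
    print(len(kept), round(mass, 3))
```

With ε = 0.1, the unnormalized weights of the three graphs are 0.81, 0.01, and 0.09, so the first graph alone carries about 0.89 of the posterior mass and suffices for η = 0.2. A stricter tolerance such as η = 0.05 forces the sketch to keep a second graph, which mirrors the trade-off between a smaller equivalence class and the risk of excluding the true graph.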