CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Andreas W.M. Sauter, Nicolò Botteghi, Erman Acar, Aske Plaat
2024-01-30
Abstract: Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, a deep reinforcement learning-based approach for causal discovery and intervention planning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses a problem in causal discovery (CD): how to use reinforcement learning (RL) to design an algorithm that infers causal structure from data efficiently and at scale. Specifically, it proposes CORE (Causal Discovery with Reinforcement Learning), a deep reinforcement learning-based method that learns to reconstruct causal graphs while simultaneously learning to perform informative interventions.

### Main Problems

1. **Distinguishing Causality from Correlation**: Traditional causal discovery methods rely mainly on observational data, which makes it difficult to tell correlation from causation. According to Pearl's Causal Hierarchy (PCH), passive observation alone cannot fully separate the two, so interventions are needed.
2. **Handling Larger Causal Graphs**: Existing causal discovery methods struggle as the number of variables grows, both in structure estimation accuracy and in sample efficiency.
3. **Optimizing Intervention Strategies**: A further key issue is how to design interventions that yield as much useful information as possible within a limited intervention budget.

### Solutions

CORE addresses these problems in the following ways:

1. **Partially Observable Markov Decision Process (POMDP) Modeling**: The causal discovery task is formalized as a POMDP, so the agent can act even though the environment state (the true causal graph) is only partially observable.
2. **Dual Q-Learning Framework**: A dual Q-learning setup learns intervention design and structure estimation separately, so that both tasks can be carried out effectively at the same time.
3. **Efficient Reward Mechanism**: A dense reward is provided during the episode instead of a single reward at the end, which improves learning efficiency.
4. **Pre-defined Training Set**: Training uses a pre-defined set of graphs instead of generating graphs on the fly, which avoids significant computational overhead and improves training efficiency.

### Experimental Verification

The paper evaluates the effectiveness and generalization ability of CORE experimentally:

- **Experimental Data**: Causal graphs with 3 to 10 variables are generated and split into training and test sets.
- **Performance Comparison**: CORE is compared with existing methods such as MCD and with random baselines. The results show that CORE outperforms them in structure estimation accuracy and sample efficiency.
- **Generalization Ability**: CORE can be applied to unseen causal graphs and still maintains high accuracy on graphs with 10 variables.

### Conclusion

CORE addresses both the efficient reconstruction of causal graphs with up to 10 variables and the optimization of intervention strategies, demonstrating its potential for practical applications. By introducing reinforcement learning, CORE improves the accuracy of causal discovery as well as training efficiency and generalization ability.
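
To make the dense reward idea from the Solutions list more concrete, here is a minimal sketch. It is not the paper's reward function: it assumes, purely for illustration, that the agent edits an estimated adjacency matrix one edge at a time and that each step is rewarded by the drop in an entrywise Hamming distance to the true graph. The names `hamming_distance`, `dense_rewards`, and `sparse_rewards` are hypothetical.

```python
import numpy as np


def hamming_distance(estimate: np.ndarray, truth: np.ndarray) -> int:
    """Number of adjacency-matrix entries where the two directed graphs differ."""
    return int(np.sum(estimate != truth))


def dense_rewards(truth: np.ndarray, edits: list[tuple[int, int]]):
    """Toggle one edge per step; reward each step by the decrease in distance.

    Assumption for illustration only: the summary does not specify
    how CORE's dense reward is actually computed.
    """
    estimate = np.zeros_like(truth)  # start from the empty graph
    rewards = []
    prev = hamming_distance(estimate, truth)
    for i, j in edits:
        estimate[i, j] = 1 - estimate[i, j]  # add or remove edge i -> j
        cur = hamming_distance(estimate, truth)
        rewards.append(prev - cur)  # positive if the estimate improved
        prev = cur
    return estimate, rewards


def sparse_rewards(truth: np.ndarray, edits: list[tuple[int, int]]):
    """Same episode, but the whole return is paid out at the final step."""
    if not edits:
        return []
    _, dense = dense_rewards(truth, edits)
    return [0] * (len(edits) - 1) + [sum(dense)]


# True graph: the chain 0 -> 1 -> 2.
truth = np.zeros((3, 3), dtype=int)
truth[0, 1] = 1
truth[1, 2] = 1

# One correct edge, one wrong edge, one correct edge, then undo the wrong one.
edits = [(0, 1), (2, 1), (1, 2), (2, 1)]

final_estimate, dense = dense_rewards(truth, edits)
sparse = sparse_rewards(truth, edits)
print("dense: ", dense)
print("sparse:", sparse)
```

Both schemes give the same total return for the episode, but the dense version tells the agent at every step whether its last edit helped or hurt (here, the wrong edge `2 -> 1` is penalized immediately), whereas the sparse version only reveals the outcome at the end.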