Abstract: Causal discovery is the challenging task of inferring causal structure from data. Motivated by Pearl's Causal Hierarchy (PCH), which tells us that passive observations alone are not enough to distinguish correlation from causation, there has been a recent push to incorporate interventions into machine learning research. Reinforcement learning provides a convenient framework for such an active approach to learning. This paper presents CORE, a deep reinforcement learning-based approach for causal discovery and intervention planning. CORE learns to sequentially reconstruct causal graphs from data while learning to perform informative interventions. Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures. Furthermore, CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency. All relevant code and supplementary material can be found at

What problem does this paper attempt to address?

The paper addresses a problem in Causal Discovery (CD): how to use Reinforcement Learning (RL) to design an algorithm that infers causal structure from data efficiently and scalably. Specifically, it proposes CORE, a deep reinforcement learning-based method that learns to reconstruct causal graphs while simultaneously learning which interventions are informative.

### Main Problems

1. **Distinguishing Causality from Correlation**: Traditional causal discovery methods rely mainly on observational data, which makes it difficult to separate correlation from causation. According to Pearl's Causal Hierarchy (PCH), passive observation alone cannot distinguish the two, so interventions are needed.
2. **Handling Larger Causal Graphs**: Existing causal discovery methods struggle as the number of variables grows, particularly in structure estimation accuracy and sample efficiency.
3. **Optimizing Intervention Strategies**: With a limited budget of interventions, the method must choose those that yield as much useful information as possible.

### Solutions

CORE addresses these problems in the following ways:

1. **Partially Observable Markov Decision Process (POMDP) Modeling**: The causal discovery task is formalized as a POMDP, because the true causal graph is hidden from the agent and only samples from it are observed.
2. **Dual Q-Learning Framework**: A dual Q-learning setup learns intervention design and structure estimation separately, so that both tasks can be carried out at the same time.
3. **Dense Reward Mechanism**: Rewards are given densely during an episode rather than only at its end, which improves learning efficiency.
4. **Predefined Training Set**: Training uses a predefined set of graphs instead of generating graphs in real time, which avoids significant computational overhead.

### Experimental Verification

The paper evaluates the effectiveness and generalization ability of CORE experimentally:

- **Experimental Data**: Causal graphs with 3 to 10 variables are generated and divided into training and test sets.
- **Performance Comparison**: CORE is compared with existing methods such as MCD and with random baselines. The results show that CORE outperforms them in structure estimation accuracy and sample efficiency.
- **Generalization Ability**: CORE can be applied to unseen causal graphs and maintains high accuracy on graphs with 10 variables.

### Conclusion

CORE addresses the efficient reconstruction of causal graphs with up to 10 variables and the optimization of intervention strategies. By using reinforcement learning, it improves the accuracy of causal discovery as well as training efficiency and generalization to unseen graphs.
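To make the contrast between dense and end-of-episode rewards concrete, here is a minimal sketch. It assumes a reward equal to the per-step reduction in the number of mismatched adjacency-matrix entries between the agent's current graph estimate and the true graph. This reward definition, the function names, and the three-variable example are illustrative assumptions, not the reward actually used by CORE.

```python
import numpy as np


def edge_mismatches(estimate, target):
    """Count adjacency-matrix entries where two directed graphs differ.

    Assumption: an entry-wise count, so a reversed edge counts as two
    mismatches. Other structural distance conventions count it as one.
    """
    estimate = np.asarray(estimate, dtype=bool)
    target = np.asarray(target, dtype=bool)
    return int(np.sum(estimate != target))


def dense_rewards(estimates, target):
    """Hypothetical dense reward: reduction in mismatches at every step."""
    distances = [edge_mismatches(e, target) for e in estimates]
    return [prev - curr for prev, curr in zip(distances, distances[1:])]


def sparse_rewards(estimates, target):
    """Hypothetical sparse reward: zero everywhere except the final step."""
    distances = [edge_mismatches(e, target) for e in estimates]
    rewards = [0] * (len(estimates) - 1)
    rewards[-1] = distances[0] - distances[-1]
    return rewards


# True graph: the chain X0 -> X1 -> X2 (row = parent, column = child).
target = [[0, 1, 0],
          [0, 0, 1],
          [0, 0, 0]]

# A hypothetical episode: the agent's graph estimate after each step.
estimates = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # start: empty graph
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],  # adds the correct edge X0 -> X1
    [[0, 1, 0], [0, 0, 0], [1, 0, 0]],  # adds a wrong edge X2 -> X0
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # fixes the mistake, adds X1 -> X2
]

dense = dense_rewards(estimates, target)
sparse = sparse_rewards(estimates, target)
print("dense :", dense)
print("sparse:", sparse)
```

Both reward sequences sum to the same total improvement over the episode, but the dense variant tells the agent at each step whether its latest change helped or hurt, which is the intuition behind rewarding during the episode rather than only at its end.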

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning
