Abstract: Discovering causal structure among a set of variables is a fundamental problem in many domains. However, state-of-the-art methods seldom consider the possibility that the observational data has missing values (incomplete data), which is ubiquitous in many real-world situations. Missing values can significantly impair performance and can even cause causal discovery algorithms to fail. In this paper, we propose an approach to discover causal structures from incomplete data by using a novel encoder and reinforcement learning (RL). The encoder is designed for missing data imputation as well as feature extraction. In particular, it learns to encode the currently available information (with missing values) into a robust feature representation, which is then used to determine where to search for the best graph. The encoder is integrated into an RL framework that can be optimized using the actor-critic algorithm. Our method takes the incomplete observational data as input and generates a causal structure graph. Experimental results on synthetic and real data demonstrate that our method can robustly generate causal structures from incomplete data. Compared with the direct combination of data imputation and causal discovery methods, our method generally performs better, with a performance gain of up to 43.2%.


### What problem does this paper attempt to solve?

This paper aims to discover causal structures from incomplete data. Existing causal discovery methods rarely account for missing values in the observed data, even though missing values are common in practical applications. Missing values can significantly reduce the performance of existing causal discovery algorithms and may even cause them to fail.

#### Research Background

1. **The importance of causal discovery**: Causal discovery is the fundamental problem of revealing causal relationships among a set of variables, with important applications in fields such as biology, economics, and genetics.
2. **Limitations of existing methods**: Most existing causal discovery methods assume that the observed data is complete. Real-world data often contains missing values, which degrade the performance of these methods or cause them to fail completely.
3. **Challenge**: Performing causal discovery effectively in the presence of missing values remains a key challenge in current research.

#### Main contributions of the paper

1. **A reinforcement-learning-based method**: The method learns causal graphs from incomplete data by combining an encoder with a reinforcement-learning framework to handle missing values.
2. **A specially designed encoder**: The encoder extracts feature representations from incomplete observed data, so that the entire reinforcement-learning framework can be optimized end to end while handling incomplete data.
3. **Experimental verification**: Experiments on synthetic and real data show that the method outperforms the direct combination of data imputation and causal discovery methods on incomplete data, with a performance improvement of up to 43.2%.

### Formula Summary

- **Feature extraction and data imputation**:
  - The encoder consists of two parts: an imputation network (ImNet) and a feature extraction network (FeatNet).
  - The imputation network generates the imputed data \(X_{\text{im}}\) from the incomplete data \(X\) and the mask \(M\):
    \[ X_{\text{im}} = \text{ImNet}(X, M) \]
  - The final complete data \(X_{\text{in}}\) after imputation keeps the observed entries of \(X\) and takes the remaining entries from \(X_{\text{im}}\). As the formula implies, \(M\) is 1 for observed entries and 0 for missing ones, and the products are element-wise:
    \[ X_{\text{in}} = (1 - M) \odot X_{\text{im}} + M \odot X \]
- **Reward function**:
  - The following scoring function evaluates the fit of the candidate graph \(G\), where \(n\) is the number of samples, \(d\) is the number of variables, \(\text{RSS}_i\) is the residual sum of squares for variable \(i\), and \(\text{Card(edges)}\) is the number of edges in \(G\):
    \[ S(G) = n \cdot d \cdot \log\left(\frac{\sum_{i = 1}^{d} \text{RSS}_i}{n \cdot d}\right) + \log(n) \cdot \text{Card(edges)} \]
  - Acyclicity constraint on the adjacency matrix \(A\):
    \[ h(A) = \text{Tr}\left(e^{A^{\top}A}\right) - d = 0 \]
  - Final reward:
    \[ \text{reward} = -\left[S(G) + \lambda_1 I(G \notin \text{DAGs}) + \lambda_2 h(A)\right] \]
- **Loss function**:
  - The discounted reward is used as the loss function for optimizing the Actor:
    \[ \text{Loss} = \frac{1}{d}\sum_{i = 1}^{d} (\text{reward} - \text{value}_i) + \lambda_3 \frac{1}{d}\log(\text{prob}) \]

Together, these components let the method learn causal graphs directly from incomplete data. The paper reports better performance than impute-then-discover baselines on both synthetic and real datasets.
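The masking step, the scoring function, and the reward above can be illustrated with a short NumPy sketch. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the mask is assumed to be 1 for observed entries, `adj[i, j] = 1` is assumed to mean an edge from variable `i` to variable `j`, and each \(\text{RSS}_i\) is assumed to come from an ordinary least-squares fit of a variable on its parents. The acyclicity value \(h(A)\) is passed in by the caller rather than computed here.

```python
import numpy as np


def combine_with_mask(x, x_im, mask):
    """X_in = (1 - M) * X_im + M * X, element-wise (assumes M == 1 where X is observed)."""
    m = np.asarray(mask, dtype=float)
    # Replace missing cells (possibly NaN) by 0 so that M * X stays finite.
    x_filled = np.where(m == 1, x, 0.0)
    return (1.0 - m) * x_im + m * x_filled


def rss_per_variable(x_in, adj):
    """Residual sum of squares of each variable regressed on its parents (assumed linear fit)."""
    n, d = x_in.shape
    rss = np.empty(d)
    for j in range(d):
        parents = np.flatnonzero(adj[:, j])  # assumption: adj[i, j] == 1 means i -> j
        design = np.column_stack([np.ones(n), x_in[:, parents]])  # intercept + parents
        coef, *_ = np.linalg.lstsq(design, x_in[:, j], rcond=None)
        resid = x_in[:, j] - design @ coef
        rss[j] = resid @ resid
    return rss


def bic_score(x_in, adj):
    """S(G) = n * d * log(sum_i RSS_i / (n * d)) + log(n) * Card(edges)."""
    adj = np.asarray(adj)
    n, d = x_in.shape
    total_rss = rss_per_variable(x_in, adj).sum()
    return n * d * np.log(total_rss / (n * d)) + np.log(n) * adj.sum()


def is_dag(adj):
    """A directed graph on d nodes is acyclic iff it has no walk of length d."""
    a = (np.asarray(adj) != 0).astype(np.int64)
    d = a.shape[0]
    walks = np.eye(d, dtype=np.int64)
    for _ in range(d):
        walks = np.minimum(walks @ a, 1)  # keep only the existence of a walk, not the count
    return not walks.any()


def reward(score, dag, h_value, lam1, lam2):
    """reward = -[S(G) + lambda_1 * I(G not in DAGs) + lambda_2 * h(A)]."""
    return -(score + lam1 * float(not dag) + lam2 * h_value)
```

A lower score means a better fit, so the negated sum gives higher reward to acyclic graphs that explain the imputed data with few edges. In the paper the imputed values come from ImNet; here `x_im` can be any array with the same shape as `x`.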

Causal Discovery from Incomplete Data using An Encoder and Reinforcement Learning

CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement Learning

Learning causal structures using hidden compact representation

CUTS: Neural Causal Discovery from Unstructured Time-Series Data

Causal Reasoning from Meta-reinforcement Learning

River runoff causal discovery with deep reinforcement learning

CUTS: Neural Causal Discovery from Irregular Time-Series Data

DARING: Differentiable Causal Discovery with Residual Independence

Deep Causal Learning: Representation, Discovery and Inference

Causality-driven Hierarchical Structure Discovery for Reinforcement Learning

Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning

Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets

A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

A Survey on Causal Reinforcement Learning

LeCaSiM: Learning Causal Structure via Inverse of M-Matrices with Adjustable Coefficients

CUTS+: High-dimensional Causal Discovery from Irregular Time-series

CIER: A Novel Experience Replay Approach with Causal Inference in Deep Reinforcement Learning

Causal Deep Learning

Causal Structure Learning Supervised by Large Language Model

A Genetic Algorithm for Causal Discovery Based on Structural Causal Model

Towards Causal Representation Learning and Deconfounding from Indefinite Data