Supervised Whole DAG Causal Discovery

Hebi Li, Qi Xiao, Jin Tian
DOI: https://doi.org/10.48550/arXiv.2006.04697
2020-06-08
Abstract: We propose to address the task of causal structure learning from data in a supervised manner. Existing work on learning causal directions by supervised learning is restricted to learning pairwise relation, and not well suited for whole DAG discovery. We propose a novel approach of modeling the whole DAG structure discovery as a supervised learning. To fit the problem in hand, we propose to use permutation equivariant models that align well with the problem domain. We evaluate the proposed approach extensively on synthetic graphs of size 10, 20, 50, 100 and real data, and show promising results compared with a variety of previous approaches.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the problem of **learning causal structures (directed acyclic graphs, DAGs) from data**. Specifically, the authors model the discovery of the entire DAG structure as a supervised learning problem, in order to overcome the limitations of existing methods in whole-DAG structure discovery.

### 1. Limitations of Existing Methods

Existing supervised methods for learning causal directions focus on **pairwise relationships** and are not well suited to discovering an entire DAG. The main families of existing approaches are:

- **Constraint-based algorithms** (such as PC, FCI) rely on conditional independence tests to infer the graph structure.
- **Score-based methods** (such as GES, CAM) look for the highest-scoring structure by searching over the space of valid DAGs.
- **Continuous optimization methods** (such as NOTEARS) model the acyclicity constraint as an equality constraint and solve it with the augmented Lagrangian method.
- **Pairwise supervised learning methods** (such as RCC, MRCL) can only infer the causal direction between a pair of variables and do not recover the global DAG structure well.

### 2. The Proposed New Method

The authors propose a new method that models the discovery of the entire DAG structure as a supervised learning problem. Specific contributions include:

- **Modeling the discovery of the entire DAG structure as a supervised learning problem for the first time**.
- **Using equivariant models** to capture the intrinsic features of the data, and developing a DAG structure learning algorithm based on equivariant deep neural networks (called DAG-EQ).
- **Conducting extensive evaluations on synthetic datasets and real-world data**, and comparing with multiple traditional methods, showing promising results.

### 3. Key Points of the Method

- **Feature representation**: The Pearson correlation coefficient matrix of the input data is used as the input feature.
- **Baseline models**: Fully connected neural networks (FC) and convolutional neural networks (CNN) are considered as baselines.
- **Equivariant model**: A permutation-equivariant model ensures that when the order of the input variables changes, the output adjacency matrix is permuted in the same way.
- **Training and inference**: The model is trained with the binary cross-entropy loss, and at inference time the DAG structure is obtained directly from a forward pass.

### 4. Experimental Results

The experimental results show that DAG-EQ performs well on different types of graphs (such as scale-free and Erdős-Rényi graphs) and on graphs of different sizes. In particular, on large graphs it outperforms traditional constraint-based and score-based methods. DAG-EQ also shows good transferability and generalization, and can perform causal structure discovery on unseen graphs.

In summary, the main goal of this paper is to improve the accuracy and efficiency of whole-DAG structure discovery through supervised learning, especially for large and complex datasets.
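
The method's ingredients described above (a Pearson correlation matrix as input, a permutation-equivariant map to edge scores, and a binary cross-entropy loss against the true adjacency matrix) can be illustrated with a minimal NumPy sketch. This is not the DAG-EQ architecture from the paper: `toy_equivariant_layer` is a hypothetical single linear layer built from a few standard equivariant terms, the chain graph and linear-Gaussian data are made up for illustration, and the weights are random rather than trained. The sketch only demonstrates the equivariance property itself: reordering the input variables reorders the output matrix in the same way.

```python
import numpy as np


def correlation_features(samples):
    """Pearson correlation matrix of an (n_samples, d) array; shape (d, d)."""
    return np.corrcoef(samples, rowvar=False)


def toy_equivariant_layer(x, weights):
    """Hypothetical linear layer on a (d, d) matrix that commutes with
    simultaneous row/column permutations (not the paper's architecture)."""
    a, b, c, e, g = weights
    row_mean = x.mean(axis=1, keepdims=True)  # (d, 1), broadcast over columns
    col_mean = x.mean(axis=0, keepdims=True)  # (1, d), broadcast over rows
    return a * x + b * x.T + c * row_mean + e * col_mean + g * x.mean()


def bce_with_logits(logits, target):
    """Mean binary cross-entropy between edge logits and a 0/1 adjacency matrix."""
    return np.mean(
        np.maximum(logits, 0) - logits * target + np.log1p(np.exp(-np.abs(logits)))
    )


rng = np.random.default_rng(0)
d = 4

# Assumed ground-truth DAG for illustration: the chain 0 -> 1 -> 2 -> 3.
adjacency = np.zeros((d, d))
for i in range(d - 1):
    adjacency[i, i + 1] = 1.0

# Simulate a linear model with unit Gaussian noise, in topological order.
samples = np.zeros((1000, d))
for j in range(d):
    samples[:, j] = samples @ adjacency[:, j] + rng.normal(size=1000)

corr = correlation_features(samples)
weights = rng.normal(size=5)  # untrained, random weights
logits = toy_equivariant_layer(corr, weights)

# Reorder the variables and recompute everything from the permuted data.
perm = rng.permutation(d)
logits_perm = toy_equivariant_layer(correlation_features(samples[:, perm]), weights)

# Equivariance: the output for permuted inputs is the permuted output.
equivariant = np.allclose(logits_perm, logits[np.ix_(perm, perm)])
print("equivariant:", equivariant)

# Training would minimise this loss; inference is a single forward pass
# followed by thresholding (a 0.5 probability threshold is assumed here).
loss = bce_with_logits(logits, adjacency)
predicted_adjacency = (logits > 0).astype(int)
```

A fully connected baseline has no such guarantee, since its output depends on the arbitrary order in which the variables happen to be listed. That is the motivation the summary gives for preferring an equivariant model.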