Gradient-based Causal Structure Learning with Normalizing Flow

Xiongren Chen
DOI: https://doi.org/10.48550/arXiv.2010.03095
2020-10-07
Abstract: In this paper, we propose a score-based normalizing flow method called DAG-NF to learn dependencies of input observation data. Inspired by Grad-CAM in computer vision, we use the Jacobian matrix of the output with respect to the input as causal relationships. This method can be generalized to any neural network, especially flow-based generative neural networks such as Masked Autoregressive Flow (MAF) and Continuous Normalizing Flow (CNF), which compute the log-likelihood loss and the divergence between the distribution of the input data and the target distribution. This method extends NOTEARS, which enforces an important acyclicity constraint on the continuous adjacency matrix of graph nodes and significantly reduces the computational complexity of the graph search space.
Machine Learning
What problem does this paper attempt to address?
The core problem this paper attempts to solve is: **how to find meaningful causal relationships from a large amount of non-sequential observational data**. Specifically, the author proposes a gradient-based normalizing flow method (DAG-NF) to learn the dependency relationships among input observational data.

### Problem Background

In data science, discovering meaningful relationships, especially causal relationships, is an important direction for creating commercial value and conducting scientific research. Randomized experiments are effective for this purpose, but in many cases they are difficult to carry out or too costly. Researchers therefore have to rely on observational data to infer causal relationships. Existing causal inference methods fall into three main categories:

1. **Constraint-based methods**: Verify a specified structure through conditional independence tests.
2. **Score-based methods**: Use a scoring function to quantify how well a Bayesian network fits the given data distribution, and find the best graph structure with a search algorithm.
3. **Structural causal model-based methods**: Describe the mechanism that generates the data and distinguish causal variables.

### Solution Proposed in the Paper

The paper proposes a score-based normalizing flow method, **DAG-NF**. Its main contributions are:

1. **General Framework**: Provide a neural network framework applicable to non-sequential data that computes the Jacobian matrix of the output with respect to the input and infers causal relationships from it. The framework applies to any neural network architecture for which the Jacobian matrix can be computed, such as an MLP or a generative model such as MAF (Masked Autoregressive Flow).
2. **Self-Shielding Architecture**: Design a new architecture with a self-shielding mechanism to obtain the causal relationships between input variables.
3. **Experimental Verification**: Compare DAG-NF against existing state-of-the-art causal inference methods in multiple experiments. The results show that DAG-NF is competitive in all experiments and is more flexible, because it only requires that the Jacobian matrix of the neural network can be computed.

### Key Technical Points

- **Jacobian Matrix as Causal Dependence**: Compute the Jacobian matrix of the output with respect to the input, \( J=\left[\frac{\partial f}{\partial x_{1}} \cdots \frac{\partial f}{\partial x_{d}}\right] \), and use its L2-norm to define the weighted adjacency matrix \( W(f)=\|J\|_{L2} \), which represents causal dependence in the nonlinear extension.
- **Conditional Independence Scoring Function**: Decompose the joint probability distribution into a product of simple known distributions, and optimize the model parameters by maximizing the log-likelihood.
- **Augmented Lagrangian Method**: Add the acyclicity constraint \( h(W(f)) = 0 \) to the loss function, and solve the resulting constrained optimization problem with the augmented Lagrangian method so that the solution is feasible.

### Experimental Results

The paper conducts experiments on synthetic data and real data. The results show that DAG-NF is superior to other methods on multiple metrics, especially when dealing with complex causal structures. The specific experimental results are shown in Table 1 and Figures 1, 2, and 3.

### Conclusion

This paper proposes DAG-NF, a Jacobian-matrix-based normalizing flow method for learning causal relationships from observational data. The method extends NOTEARS and significantly improves causal structure learning while maintaining a relatively low computational complexity.
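
To make the Jacobian-based adjacency matrix and the acyclicity constraint from the key technical points concrete, the following is a minimal NumPy sketch, not the paper's implementation. Several pieces are assumptions made for illustration: the toy functions `f_chain` and `f_cycle` stand in for a trained network, the Jacobian is approximated with central finite differences rather than automatic differentiation, the L2-norm is taken over samples for each input/output pair, and \( h \) is taken to be the NOTEARS form \( h(W)=\operatorname{tr}(e^{W \circ W})-d \), evaluated here with a truncated power series.

```python
import numpy as np


def jacobian_adjacency(f, X, eps=1e-5):
    """W[i, j] = L2 norm over samples of d f_j / d x_i.

    Central finite differences are an assumption of this sketch; a real
    model would normally use automatic differentiation instead.
    """
    n, d = X.shape
    W = np.zeros((d, d))
    for i in range(d):
        step = np.zeros(d)
        step[i] = eps
        # Column i of the Jacobian, evaluated at every sample: shape (n, d).
        dfdxi = (f(X + step) - f(X - step)) / (2 * eps)
        W[i] = np.sqrt(np.mean(dfdxi ** 2, axis=0))
    return W


def acyclicity(W, terms=20):
    """Truncated series for h(W) = tr(exp(W * W)) - d (NOTEARS form)."""
    A = W * W                      # elementwise square keeps entries >= 0
    power = np.eye(len(A))
    h = 0.0
    for k in range(1, terms + 1):
        power = power @ A / k      # now holds A^k / k!
        h += np.trace(power)
    return h


def f_chain(X):
    """Hypothetical toy model with dependencies x0 -> x1 -> x2."""
    x0, x1 = X[:, 0], X[:, 1]
    return np.stack([np.zeros_like(x0), np.sin(x0), 0.5 * x1 ** 2], axis=1)


def f_cycle(X):
    """Same toy model plus x2 -> x0, which closes a cycle."""
    out = f_chain(X)
    out[:, 0] = 0.5 * X[:, 2]
    return out


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

W_chain = jacobian_adjacency(f_chain, X)
W_cycle = jacobian_adjacency(f_cycle, X)
h_chain = acyclicity(W_chain)   # zero: the dependency graph is a DAG
h_cycle = acyclicity(W_cycle)   # positive: the graph contains a cycle

print(np.round(W_chain, 3))
print(h_chain, h_cycle)
```

In this sketch an entry `W[i, j]` is nonzero only when output `j` actually depends on input `i`, so `h` vanishes for the chain and becomes positive once the cycle is introduced. In an augmented Lagrangian scheme, a quantity like `h` would be added to the negative log-likelihood as a penalty and driven toward zero during training.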