Causal Discovery in Linear Non-Gaussian Acyclic Model With Multiple Latent Confounders

Wei Chen, Ruichu Cai, Kun Zhang, Zhifeng Hao
DOI: https://doi.org/10.1109/tnnls.2020.3045812
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Causal discovery from observational data is a fundamental problem in science. Though the linear non-Gaussian acyclic model (LiNGAM) has shown promising results in various applications, it still faces the following challenges in data with multiple latent confounders: 1) how to detect the latent confounders and 2) how to uncover the causal relations among observed and latent variables. To address these two challenges, we propose a hybrid causal discovery method for the LiNGAM with multiple latent confounders (MLCLiNGAM). First, we utilize the constraint-based method to learn the causal skeleton. Second, we identify the causal directions by conducting regression and independence tests on the adjacent pairs in the causal skeleton. Third, we detect the latent confounders with the help of the maximal clique patterns raised by the latent confounders and reconstruct the causal structure with latent variables. Theoretical results show the correctness and efficiency of the algorithms. We conduct extensive experiments on synthetic and real data, which illustrate the efficiency and effectiveness of the proposed algorithms.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture
What problem does this paper attempt to address?
The paper addresses the problem of discovering causal structure from observational data in the presence of multiple latent confounders. Specifically, it tackles two challenges faced by the linear non-Gaussian acyclic model (LiNGAM):

1. **How to detect latent confounders**: identifying these unobserved factors in datasets that contain several of them.
2. **How to reveal causal relationships between observed and latent variables**: determining how the detected confounders and the observed variables are causally related.

To address these challenges, the authors propose a hybrid causal discovery method (MLCLiNGAM), which consists of three steps:

- **Step 1**: Use a constraint-based method (such as the PC-stable algorithm) to construct a causal skeleton, removing edges between independent variables and retaining the remaining adjacencies as undirected edges.
- **Step 2**: Identify a partial causal order through regression and independence tests, and determine some causal directions from that order.
- **Step 3**: Detect the positions of latent confounders using maximal clique patterns, and determine the causal relationships between observed variables that are not affected by the same latent confounder.

The main contributions of the paper are:

1. Proposing an algorithm to estimate the causal structure over observed variables and latent confounders when multiple latent confounders are present.
2. Using maximal clique patterns to detect the positions of latent confounders and proving that this approach is valid under certain assumptions.
3. Providing a theoretical analysis of the algorithm's correctness and complexity.
4. Validating the effectiveness and efficiency of the proposed method through extensive experiments on synthetic and real data.
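The regression-and-independence idea behind Step 2 can be illustrated on a single pair of variables. The sketch below is not the authors' implementation: it assumes a biased HSIC statistic with a Gaussian kernel as the dependence measure (the summary above does not name the test), and the helper names `gaussian_gram`, `hsic`, `ols_residual`, and `pairwise_direction` are hypothetical. It also simply picks the direction with the smaller dependence score, whereas a full procedure would apply a significance test and leave the edge undirected when neither direction yields an independent residual.

```python
import numpy as np


def gaussian_gram(a):
    """Gaussian-kernel Gram matrix with a median-heuristic bandwidth."""
    d = np.abs(a[:, None] - a[None, :])
    sigma = np.median(d[d > 0])
    return np.exp(-d**2 / (2.0 * sigma**2))


def hsic(a, b):
    """Biased empirical HSIC statistic; values near 0 suggest independence."""
    n = len(a)
    K, L = gaussian_gram(a), gaussian_gram(b)
    # Double-centre K, which equals H K H for the centring matrix H.
    Kc = K - K.mean(axis=0, keepdims=True) - K.mean(axis=1, keepdims=True) + K.mean()
    return float((Kc * L).sum() / n**2)


def ols_residual(target, regressor):
    """Residual of a least-squares regression of target on regressor."""
    t = target - target.mean()
    r = regressor - regressor.mean()
    slope = (t @ r) / (r @ r)
    return t - slope * r


def pairwise_direction(x, y):
    """Compare dependence between each candidate cause and its residual."""
    fwd = hsic(x, ols_residual(y, x))  # score for the hypothesis x -> y
    bwd = hsic(y, ols_residual(x, y))  # score for the hypothesis y -> x
    direction = "x->y" if fwd < bwd else "y->x"
    return direction, fwd, bwd


# Synthetic example (assumed for illustration): x causes y, uniform noise.
rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(-1, 1, n)
y = x + rng.uniform(-1, 1, n)
direction, fwd, bwd = pairwise_direction(x, y)
print(direction, fwd, bwd)
```

In a linear model with non-Gaussian noise, the residual is independent of the regressor only in the true causal direction, which is why the two scores differ. With Gaussian noise both residuals would be independent and the comparison would carry no information.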
In summary, the paper addresses causal discovery in data with multiple latent confounders through MLCLiNGAM, a hybrid method that combines constraint-based skeleton learning, regression and independence tests for edge orientation, and maximal clique patterns for locating latent confounders.