Abstract: We consider linear non-Gaussian structural equation models that involve latent confounding. In this setting, the causal structure is identifiable, but, in general, it is not possible to identify the specific causal effects. Instead, a finite number of different causal effects result in the same observational distribution. Most existing algorithms for identifying these causal effects use overcomplete independent component analysis (ICA), which often suffers from convergence to local optima. Furthermore, the number of latent variables must be known a priori. To address these issues, we propose an algorithm that operates recursively rather than using overcomplete ICA. The algorithm first infers a source, estimates the effect of the source and its latent parents on their descendants, and then eliminates their influence from the data. For both source identification and effect size estimation, we use rank conditions on matrices formed from higher-order cumulants. We prove asymptotic correctness under the mild assumption that locally, the number of latent variables never exceeds the number of observed variables. Simulation studies demonstrate that our method achieves comparable performance to overcomplete ICA even though it does not know the number of latents in advance.

What problem does this paper attempt to address?

This paper addresses the identification of causal structure and causal effects in linear non-Gaussian structural equation models with latent confounding. In these models the causal structure is identifiable, but the specific causal effects usually cannot be uniquely determined: a finite number of different causal effects lead to the same observational distribution. Existing algorithms mainly rely on overcomplete independent component analysis (ICA), which is prone to getting trapped in local optima and requires the number of latent variables to be known in advance.

To address these problems, the authors propose a recursive algorithm that identifies the causal structure and effects step by step:

1. **Source Node Identification**: Identify a source node using rank conditions on matrices formed from higher-order cumulants.
2. **Effect Estimation**: Estimate the effect of the source node and its latent parents on their descendants, again using rank conditions on cumulant matrices.
3. **Influence Elimination**: Remove the influence of the identified nodes from the data, then repeat the procedure on the remaining data.

The main contributions of this method are:

- **Recursive Identification**: The causal structure is identified recursively, avoiding the limitations of overcomplete ICA.
- **No Prior Knowledge Required**: The number of latent variables does not need to be known in advance.
- **Asymptotic Correctness**: The algorithm is proved asymptotically correct under the assumption that, locally, the number of latent variables never exceeds the number of observed variables.

In simulation studies, the authors show that the method achieves performance comparable to overcomplete ICA even though it is not given the number of latent variables.
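To give a feel for what a rank condition on a cumulant matrix can look like, here is a minimal Python sketch on a toy two-variable model. It is not the paper's construction: the 2x2 matrix, the coefficients (2.0, 0.8, 1.5), the exponential noise, and the helper names `cumulant_matrix` and `simulate` are all assumptions chosen for illustration. In the toy model, when `x1` is an unconfounded source of `x2`, the rows of second- and third-order cumulants are proportional, so the matrix has rank 1 and its determinant is close to zero. A shared latent parent generically breaks this proportionality.

```python
import numpy as np

def cumulant_matrix(x1, x2):
    """2x2 matrix of sample second- and third-order cumulants of (x1, x2).

    For centered data these cumulants coincide with central moments.
    """
    x1 = x1 - x1.mean()
    x2 = x2 - x2.mean()
    return np.array([
        [np.mean(x1 * x1), np.mean(x1 * x2)],
        [np.mean(x1 ** 3), np.mean(x1 * x1 * x2)],
    ])

def simulate(n, confounded, rng):
    """Toy model (assumed): x1 -> x2 with effect 0.8, optional latent confounder."""
    latent = rng.exponential(1.0, n) - 1.0  # centered, skewed (non-Gaussian)
    e1 = rng.exponential(1.0, n) - 1.0
    e2 = rng.exponential(1.0, n) - 1.0
    x1 = 2.0 * latent + e1
    # The latent variable enters x2 directly only in the confounded case.
    c = 1.5 if confounded else 0.0
    x2 = 0.8 * x1 + c * latent + e2
    return x1, x2

rng = np.random.default_rng(0)
m_plain = cumulant_matrix(*simulate(500_000, False, rng))
m_conf = cumulant_matrix(*simulate(500_000, True, rng))

det_plain = np.linalg.det(m_plain)  # near zero: rank 1 up to sampling noise
det_conf = np.linalg.det(m_conf)    # clearly nonzero: rank 2

print(f"determinant without confounding: {det_plain:.3f}")
print(f"determinant with confounding:    {det_conf:.3f}")
print(f"effect estimate c12/c11 (unconfounded): {m_plain[0, 1] / m_plain[0, 0]:.3f}")
```

In the unconfounded case the ratio of the two entries in the first row recovers the effect 0.8, which mirrors how the same cumulants can serve both source identification and effect estimation. With finite samples the determinant is never exactly zero, so a practical test needs a threshold or a statistical test for the rank.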

Causal Discovery of Linear Non-Gaussian Causal Models with Unobserved Confounding

Confidence in Causal Discovery with Linear Causal Models

Nonlinear causal discovery with confounders

A Linear Non-Gaussian Acyclic Model for Causal Discovery

Causal Discovery under Latent Class Confounding

The DeCAMFounder: Non-Linear Causal Discovery in the Presence of Hidden Variables

Causal Discovery in Linear Non-Gaussian Acyclic Model With Multiple Latent Confounders

Causal Discovery in Linear Structural Causal Models with Deterministic Relations

Local Causal Discovery with Linear non-Gaussian Cyclic Models

Induced Covariance for Causal Discovery in Linear Sparse Structures

Causal Discovery in Linear Models with Unobserved Variables and Measurement Error

Causal Discovery with Latent Confounders Based on Higher-Order Cumulants

Constraint-based Causal Discovery for Non-Linear Structural Causal Models with Cycles and Latent Confounders

Generalized Independent Noise Condition for Estimating Causal Structure with Latent Variables

Partial Homoscedasticity in Causal Discovery with Linear Models

Controlling for discrete unmeasured confounding in nonlinear causal models

Discovery of non-gaussian linear causal models using ICA

Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA

Differentiable Causal Discovery For Latent Hierarchical Causal Models

Hybrid Top-Down Global Causal Discovery with Local Search for Linear and Nonlinear Additive Noise Models