Abstract: Graph-based causal discovery methods aim to capture conditional independencies consistent with the observed data and to differentiate causal relationships from indirect or induced ones. Successful construction of graphical models of data depends on the assumption of causal sufficiency: that is, that all confounding variables are measured. When this assumption is not met, learned graphical structures may become arbitrarily incorrect, and effects implied by such models may be wrongly attributed, carry the wrong magnitude, or misrepresent the direction of correlation. The wide application of graphical models to increasingly less curated "big data" draws renewed attention to the unobserved confounder problem. We present a novel method that aims to control for the latent space when estimating a DAG by iteratively deriving proxies for the latent space from the residuals of the inferred model. Under mild assumptions, our method improves structural inference of Gaussian graphical models (GGMs) and enhances identifiability of the causal effect. In addition, when the model is used to predict outcomes, the method un-confounds the coefficients on the parents of the outcomes and improves predictive performance when the out-of-sample regime is very different from the training data. We show that any improvement in the prediction of an outcome is intrinsically capped and cannot rise beyond a certain limit relative to the confounded model. We extend our methodology beyond GGMs to ordinal variables and nonlinear cases. Our R package provides both PCA and autoencoder implementations of the methodology: the PCA implementation is suitable for GGMs and carries some guarantees, while the autoencoder implementation gives better performance in general cases but without such guarantees.
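The iterative idea described in the abstract (fit the model, take residuals, derive latent proxies from them, refit) can be illustrated with a minimal Python sketch. This is not the authors' R package or their exact algorithm. It rests on several assumptions: the DAG structure is taken as known and passed in as a hypothetical `parents` dictionary rather than learned, all relationships are linear-Gaussian, the proxies are the leading principal components of the residuals, and the residuals are "partial" residuals that subtract only the observed-parent fit. The function names `partial_residuals`, `latent_proxies`, and `coef_on_parent` are illustrative only.

```python
import numpy as np


def partial_residuals(X, parents, Z=None):
    """Regress each column on its observed parents (plus proxies Z, if any),
    then subtract only the intercept and observed-parent part of the fit."""
    n, p = X.shape
    R = np.empty((n, p))
    for j in range(p):
        idx = np.asarray(parents.get(j, []), dtype=int)
        obs = np.hstack([np.ones((n, 1)), X[:, idx]])
        design = obs if Z is None else np.hstack([obs, Z])
        beta, *_ = np.linalg.lstsq(design, X[:, j], rcond=None)
        R[:, j] = X[:, j] - obs @ beta[: obs.shape[1]]
    return R


def latent_proxies(X, parents, k=1, n_iter=5):
    """Alternate between residual computation and PCA of the residuals.
    Assumption: the parent sets stay fixed instead of being re-learned."""
    Z = None
    for _ in range(n_iter):
        R = partial_residuals(X, parents, Z)
        Rc = R - R.mean(axis=0)
        left, sing, _ = np.linalg.svd(Rc, full_matrices=False)
        Z = left[:, :k] * sing[:k]  # scores on the top-k principal components
    return Z


def coef_on_parent(X, child, parent, Z=None):
    """Least-squares coefficient of `parent` in the regression for `child`."""
    cols = [np.ones(len(X)), X[:, parent]]
    if Z is not None:
        cols.extend(Z.T)
    beta, *_ = np.linalg.lstsq(np.column_stack(cols), X[:, child], rcond=None)
    return beta[1]


# Synthetic example: one unobserved confounder U drives all eight variables,
# and the only true edge is X0 -> X1 with coefficient 0.8.
rng = np.random.default_rng(0)
n, p = 2000, 8
U = rng.normal(size=n)
X = U[:, None] + 0.5 * rng.normal(size=(n, p))
X[:, 1] += 0.8 * X[:, 0]
parents = {1: [0]}

Z = latent_proxies(X, parents, k=1, n_iter=5)
proxy_corr = abs(np.corrcoef(Z[:, 0], U)[0, 1])
naive_coef = coef_on_parent(X, child=1, parent=0)
adjusted_coef = coef_on_parent(X, child=1, parent=0, Z=Z)
print(proxy_corr, naive_coef, adjusted_coef)
```

In this toy setting the proxy should track the hidden confounder closely, and the adjusted coefficient on the parent should lie nearer the true value of 0.8 than the confounded one. Because the proxy is a noisy estimate of the confounder, some bias can be expected to remain.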

Identifiability in robust estimation of tree structured models

Robust Estimation of Tree Structured Ising Models

Learning latent tree models with small query complexity

Identifiability and Identification of Switching Dynamical Networks: A Data-Based Approach

Identifiability and Consistent Estimation for Gaussian Chain Graph Models

Chernoff Information Between Gaussian Trees

Blessing of Dependence: Identifiability and Geometry of Discrete Models with Multiple Binary Latent Variables

Identifiability in Continuous Lyapunov Models

Exact Asymptotics for Learning Tree-Structured Graphical Models with Side Information: Noiseless and Noisy Samples

Identifiability of the Rooted Tree Parameter under the Cavender-Farris-Neyman Model with a Molecular Clock

Learning a tree-structured Ising model in order to make predictions

Ising Model on Locally Tree-like Graphs: Uniqueness of Solutions to Cavity Equations

Identifiability of a statistical model with two latent vectors: Importance of the dimensionality relation and application to graph embedding

An Information Theoretic Measure of Judea Pearl's Identifiability and Causal Influence

Simultaneous Identification of Sparse Structures and Communities in Heterogeneous Graphical Models

Sample-Optimal and Efficient Learning of Tree Ising models

Adversarially-Robust Inference on Trees via Belief Propagation

Identification of Latent Variables From Graphical Model Residuals

Chernoff Information of Bottleneck Gaussian Trees

Identifiability of local and global features of phylogenetic networks from average distances

Generalizing Tree Probability Estimation Via Bayesian Networks