Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes

Mizu Nishikawa-Toomey, Tristan Deleu, Jithendaraa Subramanian, Yoshua Bengio, Laurent Charlin
2024-06-04
Abstract: Bayesian causal structure learning aims to learn a posterior distribution over directed acyclic graphs (DAGs), and the mechanisms that define the relationship between parent and child variables. By taking a Bayesian approach, it is possible to reason about the uncertainty of the causal model. The notion of modelling the uncertainty over models is particularly crucial for causal structure learning since the model could be unidentifiable when given only a finite amount of observational data. In this paper, we introduce a novel method to jointly learn the structure and mechanisms of the causal model using Variational Bayes, which we call Variational Bayes-DAG-GFlowNet (VBG). We extend the method of Bayesian causal structure learning using GFlowNets to learn not only the posterior distribution over the structure, but also the parameters of a linear-Gaussian model. Our results on simulated data suggest that VBG is competitive against several baselines in modelling the posterior over DAGs and mechanisms, while offering several advantages over existing methods, including the guarantee to sample acyclic graphs, and the flexibility to generalize to non-linear causal mechanisms.
Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper primarily addresses two core issues in causal modeling:

1. **Joint Learning of Causal Structure and Mechanisms**: The paper proposes a novel method, Variational Bayes-DAG-GFlowNet (VBG), which uses Variational Bayes to jointly learn the causal graph structure (a DAG) and its mechanism parameters. The method infers not only the graph structure but also the parameters of a linear-Gaussian model relating continuous random variables.
2. **Uncertainty Quantification**: Because the approach is Bayesian, it quantifies uncertainty over causal models. This matters most when observational data is limited, since the model may then not be fully identifiable.

The paper presents experimental results on both simulated and real data, showing that VBG is competitive with baseline methods in modeling the posterior distribution over DAGs and mechanisms. It also offers several advantages: every sampled graph is guaranteed to be acyclic, and once training is complete, any number of samples can be drawn from the learned posterior.

The paper also discusses related work, particularly gradient-based methods for causal structure learning, and points out limitations of current methods. For example, DiBS and VCN can infer graph structures but do not guarantee that the sampled graphs are acyclic, while BCD-Nets ensures acyclicity but is less flexible in how the mechanisms are parameterized. VBG, in contrast, addresses both limitations and performs well in the experimental evaluations.
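
To make the two ingredients discussed above concrete, here is a minimal sketch of (a) checking that a graph is acyclic and (b) drawing one sample from a linear-Gaussian model over a DAG by ancestral sampling. It is not the VBG algorithm and does not learn anything. The function names, the example graphs, the edge weights, and the shared noise standard deviation are all assumptions made for illustration.

```python
import random


def topological_order(adj):
    """Return a topological order of the nodes, or None if the graph has a cycle.

    adj[i][j] == 1 means there is a directed edge i -> j.
    """
    n = len(adj)
    in_degree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if in_degree[j] == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for child in range(n):
            if adj[node][child]:
                in_degree[child] -= 1
                if in_degree[child] == 0:
                    ready.append(child)
    # If some nodes were never freed, they sit on a cycle.
    return order if len(order) == n else None


def sample_linear_gaussian(adj, weights, noise_std, rng):
    """Draw one sample with x_j = sum_i weights[i][j] * x_i + Gaussian noise.

    Parents are sampled before children, so the graph must be a DAG.
    """
    order = topological_order(adj)
    if order is None:
        raise ValueError("graph contains a cycle")
    n = len(adj)
    x = [0.0] * n
    for j in order:
        mean = sum(weights[i][j] * x[i] for i in range(n) if adj[i][j])
        x[j] = mean + rng.gauss(0.0, noise_std)
    return x


# Hypothetical 3-variable chain 0 -> 1 -> 2 with made-up edge weights.
chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
weights = [[0.0, 2.0, 0.0], [0.0, 0.0, -1.5], [0.0, 0.0, 0.0]]
# The same nodes with an extra edge 2 -> 0, which closes a cycle.
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

rng = random.Random(0)
sample = sample_linear_gaussian(chain, weights, noise_std=0.1, rng=rng)
```

A method that can emit cyclic graphs needs a check like `topological_order` (or a penalty term) after the fact. An approach that builds graphs so that acyclicity holds by construction never produces an input for which the check fails.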