Causal Discovery and Prediction: Methods and Algorithms

Gilles Blondel
DOI: https://doi.org/10.48550/arXiv.2309.09416
2023-09-18
Abstract: We are not only observers but also actors of reality. Our capability to intervene and alter the course of some events in the space and time surrounding us is an essential component of how we build our model of the world. In this doctoral thesis we introduce a generic a-priori assessment of each possible intervention, in order to select only the most cost-effective interventions and avoid unnecessary systematic experimentation on the real world. Based on this a-priori assessment, we propose an active learning algorithm that identifies the causal relations in any given causal model, using a least-cost sequence of interventions. Our algorithm introduces several novel aspects. In most scenarios, it is able to discard many causal model candidates using relatively inexpensive interventions that test only one value of the intervened variables. Also, the number of interventions performed by the algorithm can be bounded by the number of causal model candidates. Hence, fewer initial candidates (or equivalently, more prior knowledge) lead to fewer interventions for causal discovery. Causality is intimately related to time, as causes appear to precede their effects. Cyclical causal processes are a very interesting case of causality in relation to time. In this doctoral thesis we introduce a formal analysis of time-cyclical causal settings by defining a causal analog to the purely observational Dynamic Bayesian Networks, and provide a sound and complete algorithm for the identification of causal effects in the cyclic setting. We introduce two types of hidden confounder variables in this framework, which affect the identification procedures in substantially different ways, a distinction with no analog in either Dynamic Bayesian Networks or standard causal graphs.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the discovery and prediction of causal relationships. Specifically, it raises the following questions:

1. **Measurement of Intervention Efficiency**: How can we measure the efficiency of an intervention in discovering causal relationships? The paper introduces an a-priori assessment of each possible intervention, so that only the most cost-effective interventions are selected and unnecessary systematic experimentation on the real world is avoided.
2. **Active Learning Algorithm**: How can we design an active learning algorithm that identifies the causal relationships in a given causal model using a least-cost sequence of interventions? The paper proposes an active learning algorithm based on this a-priori assessment, which in most cases can rule out many candidate causal models through relatively inexpensive interventions.
3. **Time-Cyclical Causal Processes**: How can we identify causal relationships in causal processes that repeat cyclically over time? The paper introduces a causal analog of Dynamic Bayesian Networks and provides a sound and complete algorithm to identify causal effects in the cyclic setting. It also discusses how two types of hidden confounding variables affect the identification procedure.
4. **Prediction of Causal Effects in Dynamic Causal Networks**: How can we identify causal effects in dynamic causal networks in the presence of static and dynamic hidden confounding variables? The paper proposes algorithms for these two cases and discusses the issue of non-identifiability.

In summary, the paper focuses on how to discover and predict causal relationships through effective interventions and algorithms, especially in dynamic systems and in the presence of hidden confounding variables. Solving these problems matters for understanding the causal mechanisms of complex systems.
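
Points 1 and 2 above can be illustrated with a toy sketch. This is not the thesis's algorithm. It assumes a deliberately simplified setting: candidate models are small DAGs given as sets of directed edges, an intervention on a variable reveals exactly which other variables respond (its descendants in the true graph), and each intervention has a fixed, known cost. The names `descendants`, `partition`, `score`, and `discover`, as well as the scoring rule (expected number of discarded candidates per unit cost under a uniform prior), are hypothetical choices made for this illustration.

```python
def descendants(edges, node):
    """Return every variable reachable from `node` along directed edges."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for nxt in children.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)


def partition(candidates, target):
    """Group candidates by the outcome they predict for intervening on `target`."""
    groups = {}
    for cand in candidates:
        groups.setdefault(descendants(cand, target), []).append(cand)
    return groups


def score(candidates, target, cost):
    """A-priori value of an intervention (assumed rule, not from the thesis):
    expected number of discarded candidates per unit cost, uniform prior."""
    n = len(candidates)
    groups = partition(candidates, target)
    expected_left = sum(len(g) ** 2 for g in groups.values()) / n
    return (n - expected_left) / cost[target]


def discover(candidates, true_graph, cost):
    """Greedily intervene until one candidate is left or none can be split."""
    candidates = list(candidates)
    performed = []
    while len(candidates) > 1:
        # Only consider interventions whose outcome differs across candidates.
        splitting = [v for v in cost if len(partition(candidates, v)) > 1]
        if not splitting:
            break  # remaining candidates are indistinguishable in this toy model
        best = max(splitting, key=lambda v: score(candidates, v, cost))
        observed = descendants(true_graph, best)  # stands in for a real experiment
        candidates = [c for c in candidates if descendants(c, best) == observed]
        performed.append(best)
    return candidates, performed


# Four candidate structures over A, B, C; the first one plays the true model.
g1 = frozenset({("A", "B"), ("B", "C")})
g2 = frozenset({("B", "A"), ("B", "C")})
g3 = frozenset({("A", "B"), ("C", "B")})
g4 = frozenset({("C", "B"), ("B", "A")})
cost = {"A": 1.0, "B": 2.0, "C": 1.0}  # intervening on B is twice as expensive

remaining, performed = discover([g1, g2, g3, g4], true_graph=g1, cost=cost)
print(remaining == [g1], performed)
```

Because every intervention chosen in this sketch splits the remaining candidates into at least two groups, each one discards at least one candidate, so the number of interventions never exceeds the number of candidates minus one. That mirrors, in miniature, the bound stated in the abstract. The stopping condition also shows a limit of this toy setup: two candidates that predict the same responders for every intervention cannot be told apart here.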