Reconciling Causality and Statistics

Pirmin Lemberger, Denis Oblin
DOI: https://doi.org/10.48550/arXiv.2007.03940
2020-09-24
Abstract: Statisticians have warned us since the early days of their discipline that experimental correlation between two observations by no means implies the existence of a causal relation. The question of what clues exist in observational data that could inform us about the existence of such causal relations is nevertheless more than legitimate. It actually lies at the root of any scientific endeavor. For decades, however, the only accepted method among statisticians to elucidate causal relationships was the so-called Randomized Controlled Trial. Besides this notorious exception, causality questions remained largely taboo for many. One reason for this state of affairs was the lack of an appropriate mathematical framework to formulate such questions in an unambiguous way. Fortunately, things have changed in recent years with the advent of the so-called Causality Revolution initiated by Judea Pearl and coworkers. The aim of this pedagogical paper is to present their ideas and methods in a compact and self-contained fashion, with concrete business examples as illustrations.
Artificial Intelligence, Statistics Theory
What problem does this paper attempt to address?
This paper addresses the problem of identifying causal relationships from observational data. Statisticians have long warned that a correlation between two observations does not imply a causal relationship. Yet the question of which clues in observational data can point to a causal relationship is an important one, and it underlies all scientific research. For decades, the only method statisticians accepted for establishing causal relationships was the randomized controlled trial (RCT). Apart from this well-known exception, causality remained largely a taboo topic, mainly because there was no appropriate mathematical framework in which to formulate such questions unambiguously. The "causal revolution" initiated by Judea Pearl and others changed this situation. Pearl provided a mathematically rigorous framework that allows causal questions to be posed explicitly and, when possible, answered systematically. This pedagogical paper aims to introduce the ideas and methods of Pearl and his colleagues in a compact and self-contained way, using concrete business cases as illustrations. Specifically, the paper covers the following topics:

1. **Causality and statistical correlation**: Explain why statistical analysis alone cannot handle all aspects of causal relationships.
2. **Randomized controlled trial (RCT)**: Discuss the importance of RCTs in clinical trials and point out their limitations for causal inference.
3. **Causal graph models and interventions**: Introduce how to use causal graph models (such as Bayesian networks) and the intervention operator (do-operator) to identify and calculate causal effects.
4. **d-separation criterion**: Describe how to use the d-separation criterion to determine conditional independence between variables, thereby identifying causal paths.
5. **Back-door criterion and front-door criterion**: Introduce how to use these graphical criteria to determine whether a causal effect is identifiable in specific situations.
6. **Inference of causal graphs**: Discuss how to infer a causal graph from raw data when no prior causal graph is available.

In summary, the paper aims to provide a comprehensive framework that helps readers understand how to perform causal reasoning from observational data, and in particular how to identify and calculate causal effects in complex causal networks.
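To make the do-operator and the back-door criterion listed above more concrete, here is a minimal sketch in Python. The toy model is an assumption made for illustration and is not taken from the paper: a binary confounder `Z` influences both a binary treatment `X` and a binary outcome `Y`, and all probability values are hypothetical. Because `Z` blocks the only back-door path from `X` to `Y`, the interventional quantity can be computed as P(Y=1 | do(X=x)) = Σ_z P(Y=1 | x, z) P(z), which generally differs from the observational conditional P(Y=1 | X=x).

```python
from itertools import product

# Hypothetical toy model (not from the paper): Z -> X, Z -> Y, X -> Y, all binary.
P_Z1 = 0.5                              # P(Z=1)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}         # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {                       # P(Y=1 | X=x, Z=z), keyed by (x, z)
    (0, 0): 0.1, (1, 0): 0.3,
    (0, 1): 0.5, (1, 1): 0.7,
}


def bern(p1, value):
    """Probability that a Bernoulli variable with P(1)=p1 takes `value`."""
    return p1 if value == 1 else 1.0 - p1


def joint(x, y, z):
    """Observational joint P(x, y, z), factorized along the causal graph."""
    return (bern(P_Z1, z)
            * bern(P_X1_GIVEN_Z[z], x)
            * bern(P_Y1_GIVEN_XZ[(x, z)], y))


def p_y1_given_x(x):
    """Observational conditional P(Y=1 | X=x), computed from the joint."""
    numerator = sum(joint(x, 1, z) for z in (0, 1))
    denominator = sum(joint(x, y, z) for y, z in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Back-door adjustment: sum over z of P(Y=1 | x, z) * P(z)."""
    total = 0.0
    for z in (0, 1):
        p_z = sum(joint(xx, yy, z) for xx, yy in product((0, 1), repeat=2))
        p_xz = sum(joint(x, yy, z) for yy in (0, 1))
        p_y1_given_xz = joint(x, 1, z) / p_xz
        total += p_y1_given_xz * p_z
    return total


# Naive (correlational) contrast vs. adjusted (causal) contrast.
naive_effect = p_y1_given_x(1) - p_y1_given_x(0)
causal_effect = p_y1_do_x(1) - p_y1_do_x(0)

print(f"naive  P(Y=1|X=1) - P(Y=1|X=0)         = {naive_effect:.2f}")
print(f"causal P(Y=1|do(X=1)) - P(Y=1|do(X=0)) = {causal_effect:.2f}")
```

In this toy model the naive contrast overstates the effect of `X` on `Y` because `Z` raises both the chance of treatment and the chance of the outcome. The adjustment removes that spurious part, but only because `Z` is assumed to be observed and to satisfy the back-door criterion in the assumed graph.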