Abstract: Structural causal models postulate noisy functional relations among a set of interacting variables. The causal structure underlying each such model is naturally represented by a directed graph whose edges indicate for each variable which other variables it causally depends upon. Under a number of different model assumptions, it has been shown that this causal graph and, thus also, causal effects are identifiable from mere observational data. For these models, practical algorithms have been devised to learn the graph. Moreover, when the graph is known, standard techniques may be used to give estimates and confidence intervals for causal effects. We argue, however, that a two-step method that first learns a graph and then treats the graph as known yields confidence intervals that are overly optimistic and can drastically fail to account for the uncertain causal structure. To address this issue we lay out a framework based on test inversion that allows us to give confidence regions for total causal effects that capture both sources of uncertainty: causal structure and numerical size of nonzero effects. Our ideas are developed in the context of bivariate linear causal models with homoscedastic errors, but as we exemplify they are generalizable to larger systems as well as other settings such as, in particular, linear non-Gaussian models.
What problem does this paper attempt to address?
The paper asks how to construct confidence intervals in causal discovery that account for two sources of uncertainty at once: the causal structure and the magnitude of the causal effect. The traditional two-step method first learns the causal graph from data, then estimates the causal effect and constructs a confidence interval as if that graph were known. The resulting intervals are often too optimistic because they ignore the uncertainty introduced by data-driven model selection. The paper therefore proposes a framework that constructs confidence intervals for total causal effects by test inversion, so that the intervals capture the uncertainty in both the causal structure and the size of nonzero effects.
### Background and Motivation of the Paper
In causal discovery, structural causal models (SCMs) are a commonly used framework that represents the causal relationships between variables through directed graphs. Under certain model assumptions, the causal graph, and with it the causal effects, can be identified from observational data. However, existing methods usually proceed in two steps: first learn the causal graph, then estimate the causal effect while treating the learned graph as known. This approach ignores the uncertainty of the causal structure, resulting in overly optimistic confidence intervals that do not reflect the true level of uncertainty.
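The overconfidence of the two-step approach can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a zero-mean bivariate linear Gaussian model with equal error variances, in which the causal direction is identifiable because the cause has the smaller variance, and it uses a plain Wald interval in the second step. The function names `two_step_interval` and `coverage` are hypothetical.

```python
import numpy as np

def two_step_interval(x1, x2, z=1.96):
    """Naive two-step interval for the total effect of X1 on X2.

    Assumes zero-mean data and equal error variances (illustrative setup).
    """
    # Step 1: select the graph. With equal error variances the cause
    # has the smaller variance, so compare the sample second moments.
    if np.mean(x1**2) > np.mean(x2**2):
        # X2 -> X1 selected: the effect of X1 on X2 is declared exactly zero.
        return 0.0, 0.0
    # Step 2: treat X1 -> X2 as known and report an OLS Wald interval.
    n = len(x1)
    slope = np.dot(x1, x2) / np.dot(x1, x1)
    resid = x2 - slope * x1
    se = np.sqrt(np.sum(resid**2) / (n - 1) / np.dot(x1, x1))
    return slope - z * se, slope + z * se

def coverage(beta, n, reps, seed=0):
    """Fraction of simulated data sets whose interval contains the true beta."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x1 = rng.normal(size=n)               # X1 = e1
        x2 = beta * x1 + rng.normal(size=n)   # X2 = beta * X1 + e2
        lo, hi = two_step_interval(x1, x2)
        hits += int(lo <= beta <= hi)
    return hits / reps

cov_weak = coverage(beta=0.1, n=100, reps=2000)    # direction is hard to learn
cov_strong = coverage(beta=1.0, n=100, reps=2000)  # direction is easy to learn
print(f"coverage, weak effect:   {cov_weak:.3f}")
print(f"coverage, strong effect: {cov_strong:.3f}")
```

When the effect is weak, the wrong direction is selected in a large share of runs, so the nominal 95% interval should cover the true effect far less often than advertised. For a strong effect the direction is almost always learned correctly and coverage should be close to nominal.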
### Main Contributions of the Paper
1. **Proposing a New Framework**: The paper proposes a method based on test inversion that accounts for uncertainty in both the causal structure and the causal effect magnitude when constructing confidence intervals.
2. **Theoretical Basis**: The paper details how to construct such confidence intervals in linear causal models, with a focus on the bivariate linear case, and shows how the construction generalizes to larger systems and other model classes.
3. **Experimental Verification**: Through simulation experiments, the paper verifies the effectiveness of the proposed method and compares it with other methods (such as the bootstrap method), showing the advantages of the new method in terms of coverage probability and confidence interval width.
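The test-inversion idea behind the first contribution can be written down generically. The notation below ($\psi$ for the total causal effect, $C_{1-\alpha}$ for the confidence set, $H_0^{(\psi_0)}$ for the null hypothesis) is an assumption made for this summary and need not match the paper's symbols.

```latex
C_{1-\alpha}(X) \;=\; \bigl\{\, \psi_0 \in \mathbb{R} \;:\;
  \text{a level-}\alpha \text{ test of } H_0^{(\psi_0)}
  \text{ does not reject given data } X \,\bigr\},
\qquad
H_0^{(\psi_0)} :\; \psi = \psi_0 .

% If every test has level alpha, the set has the desired coverage:
\Pr\bigl( \psi \in C_{1-\alpha}(X) \bigr) \;\ge\; 1 - \alpha .
```

Each hypothesis $H_0^{(\psi_0)}$ ranges over all causal structures compatible with the effect value $\psi_0$. In particular, the hypothesis for $\psi_0 = 0$ pools every graph in which the effect vanishes, which is how uncertainty about the structure enters the confidence set.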
### Specific Methods
- **Likelihood Ratio Test**: The paper proposes two methods based on the likelihood ratio test (LRT1 and LRT2) for constructing confidence intervals. These methods determine the boundaries of the confidence intervals by testing specific hypotheses.
- **Split Likelihood Ratio Test**: The paper also introduces the split likelihood ratio test (SLRT), which is a conservative but effective finite-sample method, especially suitable for the case of irregular composite hypotheses.
- **Heuristic Methods**: To simplify the calculation, the paper also proposes some heuristic methods (such as estSLRT), which perform well in practical applications.
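To make the split likelihood ratio idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a zero-mean bivariate Gaussian model with equal error variances, splits the sample in half, estimates the unrestricted model on one half, and on the other half compares that estimate against the best fit under each null hypothesis "the total effect of X1 on X2 equals `psi0`". The rejection threshold `1 / alpha` is the standard one for split likelihood ratio tests; all function names and the grid are hypothetical.

```python
import numpy as np

def rss(x1, x2, graph, coef):
    """Residual sum of squares over both structural equations."""
    if graph == "1->2":  # X1 = e1, X2 = coef * X1 + e2
        return np.sum(x1**2) + np.sum((x2 - coef * x1)**2)
    return np.sum(x2**2) + np.sum((x1 - coef * x2)**2)  # graph "2->1"

def loglik(x1, x2, graph, coef, sigma2):
    """Gaussian log-likelihood with common error variance sigma2."""
    n = len(x1)
    return -n * np.log(2 * np.pi * sigma2) - rss(x1, x2, graph, coef) / (2 * sigma2)

def fit(x1, x2, graph):
    """Maximum likelihood estimate (coef, sigma2) for a fixed graph."""
    cause, effect = (x1, x2) if graph == "1->2" else (x2, x1)
    coef = np.dot(cause, effect) / np.dot(cause, cause)
    return coef, rss(x1, x2, graph, coef) / (2 * len(x1))

def max_loglik_null(x1, x2, psi0):
    """Maximised log-likelihood under H0: total effect of X1 on X2 is psi0."""
    if psi0 != 0.0:
        # A nonzero effect forces the graph 1->2 with coefficient psi0.
        sigma2 = rss(x1, x2, "1->2", psi0) / (2 * len(x1))
        return loglik(x1, x2, "1->2", psi0, sigma2)
    # Zero effect: graph 2->1 with any coefficient (0 gives the empty graph).
    coef, sigma2 = fit(x1, x2, "2->1")
    return loglik(x1, x2, "2->1", coef, sigma2)

def split_lrt_set(x1, x2, grid, alpha=0.05):
    """Grid values of psi0 that the split likelihood ratio test does not reject."""
    half = len(x1) // 2
    est1, est2 = x1[:half], x2[:half]    # half used for estimation
    tst1, tst2 = x1[half:], x2[half:]    # half used for testing
    # Unrestricted estimate on the estimation half: best of the two graphs.
    fits = {g: fit(est1, est2, g) for g in ("1->2", "2->1")}
    g_hat = min(fits, key=lambda g: fits[g][1])
    coef_hat, sigma2_hat = fits[g_hat]
    ll_alt = loglik(tst1, tst2, g_hat, coef_hat, sigma2_hat)
    threshold = np.log(1 / alpha)
    keep = [p for p in grid if ll_alt - max_loglik_null(tst1, tst2, p) < threshold]
    return np.array(keep)

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 1.0 * x1 + rng.normal(size=n)           # true total effect is 1.0
grid = np.round(np.linspace(-2, 2, 401), 2)  # candidate effect values
conf_set = split_lrt_set(x1, x2, grid)
print("confidence set spans", conf_set.min(), "to", conf_set.max())
```

Because the data are simulated as independent draws, a deterministic half split is used here; with real data a random split would be the safer choice. The resulting set need not be an interval in general, since the value 0 is tested against a different family of graphs than the nonzero values.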
### Experimental Results
- **Coverage Probability**: All proposed methods can achieve the expected coverage probability (95%) under different sample sizes and causal effect magnitudes.
- **Confidence Interval Width**: The split likelihood ratio test (SLRT) is the more conservative approach and accordingly produces wider confidence intervals. The heuristic variant estSLRT generates narrower confidence intervals while maintaining the coverage probability.
- **Zero-effect Exclusion Ability**: As the sample size increases, all methods exclude a zero effect more reliably, indicating that they can not only construct valid confidence intervals but also judge whether the causal effect is nonzero.
### Conclusion
The paper addresses the problem of constructing confidence intervals in causal discovery that account for uncertainty in both the causal structure and the causal effect magnitude. By proposing a test-inversion framework together with specific likelihood ratio tests, it provides a more reliable basis for statistical inference on causal effects than the two-step approach.