Discovery and inference of possibly bi-directional causal relationships with invalid instrumental variables

Wei Li, Rui Duan, Sai Li
DOI: https://doi.org/10.48550/arXiv.2407.11646
2024-07-16
Abstract: Learning causal relationships between pairs of complex traits from observational studies is of great interest across various scientific domains. However, most existing methods assume the absence of unmeasured confounding and restrict causal relationships between two traits to be uni-directional, which may be violated in real-world systems. In this paper, we address the challenge of causal discovery and effect inference for two traits while accounting for unmeasured confounding and potential feedback loops. By leveraging possibly invalid instrumental variables, we provide identification conditions for causal parameters in a model that allows for bi-directional relationships, and we also establish identifiability of the causal direction under the introduced conditions. Then we propose a data-driven procedure to detect the causal direction and provide inference results about causal effects along the identified direction. We show that our method consistently recovers the true direction and produces valid confidence intervals for the causal effect. We conduct extensive simulation studies to show that our proposal outperforms existing methods. We finally apply our method to analyze real data sets from UK Biobank.
Methodology
What problem does this paper attempt to address?
This paper addresses how to discover and infer the causal relationship between two complex traits from observational data when unmeasured confounding and bi-directional causal relationships may both be present. It focuses on three key points:

1. **Unmeasured confounding**: Most existing causal discovery methods assume that there is no unmeasured confounding, an assumption that often fails in practice. The paper proposes a method that accounts for unmeasured confounders.
2. **Bi-directional causal relationships**: Traditional methods usually assume that the causal relationship between two traits is uni-directional, which can be an oversimplification in real-world systems with feedback loops. The paper proposes a model that allows bi-directional relationships and gives conditions for identifying the causal parameters in that model.
3. **Invalid instrumental variables**: Instrumental variable methods are commonly used to reduce confounding bias, but their reliability depends on the validity of the instruments. The paper studies causal discovery and effect inference when some of the candidate instruments may be invalid.

### Main contributions of the paper

1. **Identification conditions**: The paper gives two sets of identification results for causal parameters and directions. The first assumes that the majority rule holds in a known direction and that the error in the other direction satisfies a covariance heterogeneity condition; under these assumptions it proves identifiability of the causal parameters in the bi-directional model. The second covers the more realistic case in which it is not known which direction satisfies the augmented majority rule. Under the augmented majority rule together with the covariance heterogeneity condition, the paper establishes identification of both the causal parameters and the causal direction.
2. **Data-driven method**: The paper develops a data-driven procedure called PCH (Plurality-then-Covariance-Heterogeneity) for detecting the causal direction and constructing confidence intervals for causal effects. As its name indicates, PCH combines a plurality-based step with a covariance-heterogeneity-based step. It consistently recovers the causal direction and produces valid confidence intervals in the presence of unmeasured confounding and potential bi-directional relationships.
3. **Theoretical and empirical analysis**: Extensive simulation studies show that the proposed method outperforms existing methods, and the paper applies it to real data sets from UK Biobank.

### Conclusion

The paper proposes a framework for discovering and inferring the causal relationship between two complex traits from observational data in the presence of unmeasured confounding and potential bi-directional causal relationships. It contributes identification theory for the bi-directional model and a practical procedure, PCH, that performs well in simulations and in a UK Biobank application.
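
To make the majority rule and the role of invalid instruments more concrete, here is a minimal Python sketch. It is not the paper's PCH procedure: it uses the classical median-of-ratios idea on a hypothetical linear feedback model, and works with exact population regression slopes rather than estimated ones. The model equations, the function names `reduced_form` and `median_ratio`, and all numeric values are assumptions chosen for illustration.

```python
from statistics import median


def reduced_form(gamma, pi, b_xy, b_yx):
    """Population slopes of X and Y on each instrument Z_j in the toy model

        X = b_yx * Y + sum_j gamma[j] * Z_j + U + e_x
        Y = b_xy * X + sum_j pi[j]    * Z_j + U + e_y

    where the Z_j are mutually independent and independent of the
    unmeasured confounder U and the errors. pi[j] != 0 marks an
    instrument that is invalid for the X -> Y direction.
    """
    det = 1.0 - b_xy * b_yx  # the feedback system is solvable only if this is nonzero
    if det == 0:
        raise ValueError("b_xy * b_yx must differ from 1")
    slope_x = [(g + b_yx * p) / det for g, p in zip(gamma, pi)]
    slope_y = [(p + b_xy * g) / det for g, p in zip(gamma, pi)]
    return slope_x, slope_y


def median_ratio(slope_x, slope_y):
    """Per-instrument ratio estimates of the X -> Y effect and their median."""
    ratios = [sy / sx for sx, sy in zip(slope_x, slope_y)]
    return median(ratios), ratios


# Hypothetical values: effect X -> Y is 0.5, feedback Y -> X is 0.2.
B_XY, B_YX = 0.5, 0.2
GAMMA = [1.0, 0.8, 1.2, 0.9, 1.1]  # instrument strengths on X
PI = [0.0, 0.0, 0.0, 0.6, -0.4]    # last two instruments are invalid

slope_x, slope_y = reduced_form(GAMMA, PI, B_XY, B_YX)
estimate, ratios = median_ratio(slope_x, slope_y)
print("per-instrument ratios:", ratios)
print("median-of-ratios estimate of the X -> Y effect:", estimate)
```

In this toy setting, three of the five instruments are valid, so their ratios agree with the true X -> Y effect even though feedback is present, and the median is unaffected by the two invalid instruments. If valid instruments were a minority, the median would no longer be guaranteed to recover the effect, which is the kind of situation that motivates weaker conditions such as the plurality-based step named in PCH.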