Causal Discovery with Fewer Conditional Independence Tests

Kirankumar Shiragur, Jiaqi Zhang, Caroline Uhler
2024-06-04
Abstract: Many questions in science center around the fundamental problem of understanding causal relationships. However, most constraint-based causal discovery algorithms, including the well-celebrated PC algorithm, often incur an exponential number of conditional independence (CI) tests, posing limitations in various applications. Addressing this, our work focuses on characterizing what can be learned about the underlying causal graph with a reduced number of CI tests. We show that it is possible to learn a coarser representation of the hidden causal graph with a polynomial number of tests. This coarser representation, named Causal Consistent Partition Graph (CCPG), comprises a partition of the vertices and a directed graph defined over its components. CCPG satisfies consistency of orientations and additional constraints which favor finer partitions. Furthermore, it reduces to the underlying causal graph when the causal graph is identifiable. As a consequence, our results offer the first efficient algorithm for recovering the true causal graph with a polynomial number of tests, in special cases where the causal graph is fully identifiable through observational data and potentially additional interventions.
Machine Learning, Artificial Intelligence, Methodology
### What problem does this paper attempt to solve?

This paper addresses a central problem in causal discovery: **how much of the underlying causal graph can still be inferred when the number of conditional independence (CI) tests is reduced?** Many existing constraint-based causal discovery algorithms, such as the well-known PC algorithm, often require an exponential number of CI tests, which limits their use in practice. The paper therefore studies **what can be learned about the underlying causal graph with fewer CI tests**.

### Main contributions of the paper

1. **Proposed a new representation, the Causal Consistent Partition Graph (CCPG)**:
   - A CCPG consists of a partition of the vertices and a directed acyclic graph (DAG) defined on its components.
   - It is a coarser representation of the causal graph that can be recovered with a polynomial number of CI tests, and it reduces to the true causal graph when that graph is identifiable.
2. **Provided an efficient algorithm**:
   - The algorithm recovers the CCPG representation of the causal graph using a polynomial number of CI tests.
   - In special cases (for example, when the causal graph is fully identifiable from observational data), the algorithm recovers the true causal graph with a polynomial number of CI tests.
3. **Extended the results to interventional data**:
   - Studied the role of interventional data in structure learning.
   - Provided an algorithm that recovers the true causal graph when given sufficient interventional data.

### Formulas and concepts

- **Conditional independence (CI) test**: Determines whether two sets of variables are independent given a third set. \[ A \perp B \mid C \] indicates that, given the set \( C \), the sets \( A \) and \( B \) are conditionally independent.
- **v-structure**: Three distinct vertices \( u, v, w \) with \( u \rightarrow v \leftarrow w \), where \( u \) and \( w \) are not adjacent.
- **Covered edge**: An edge \( u \rightarrow v \) is a covered edge if \( Pa[u] = Pa(v) \), where \( Pa(v) \) is the set of parents of \( v \) and \( Pa[u] = Pa(u) \cup \{u\} \).

### Conclusion

By introducing the CCPG representation and an efficient algorithm for learning it, the paper reduces the number of CI tests required for causal discovery, which makes constraint-based methods more feasible in practice. It also shows that, in some cases, the causal graph can be recovered exactly with only a polynomial number of CI tests.
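To make the v-structure and covered-edge definitions from the "Formulas and concepts" part concrete, here is a minimal Python sketch that checks both on a small DAG. The toy graph `PARENTS` and the helper names `adjacent`, `v_structures`, and `covered_edges` are hypothetical and not taken from the paper. The sketch also assumes that the covered-edge condition reads as \( Pa(v) = Pa(u) \cup \{u\} \), which is the standard definition.

```python
from itertools import combinations

# Hypothetical toy DAG, stored as a map from each vertex to its set of parents.
# Edges: a -> b, a -> c, b -> c, d -> c.
PARENTS = {
    "a": set(),
    "b": {"a"},
    "c": {"a", "b", "d"},
    "d": set(),
}


def adjacent(parents, x, y):
    """Return True if x -> y or y -> x is an edge of the DAG."""
    return x in parents[y] or y in parents[x]


def v_structures(parents):
    """Return all triples (u, v, w) with u -> v <- w and u, w not adjacent."""
    found = []
    for v, pa in parents.items():
        # Every unordered pair of parents of v is a candidate collider at v.
        for u, w in combinations(sorted(pa), 2):
            if not adjacent(parents, u, w):
                found.append((u, v, w))
    return found


def covered_edges(parents):
    """Return all edges u -> v whose parent sets satisfy Pa(v) = Pa(u) ∪ {u}."""
    return [
        (u, v)
        for v, pa in parents.items()
        for u in sorted(pa)
        if pa == parents[u] | {u}
    ]


if __name__ == "__main__":
    print(v_structures(PARENTS))
    print(covered_edges(PARENTS))
```

In this toy graph, `a` and `b` are adjacent, so they do not form a v-structure at `c`, while each of them forms one with `d`. The only covered edge is `a -> b`, because `b` has no parents other than `a` and `a` has none at all.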