Testing Conditional Independence Between Latent Variables by Independence Residuals

Zhengming Chen, Jie Qiao, Feng Xie, Ruichu Cai, Zhifeng Hao, Keli Zhang
DOI: https://doi.org/10.1109/tnnls.2024.3368561
IF: 14.255
2024-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Conditional independence (CI) testing is an important problem, especially in causal discovery. Most testing methods assume that all variables are fully observable and then test CI among the observed data. Such an assumption is often untenable in applications such as psychological analysis of mental health status and medical diagnosis, where researchers need to consider the existence of latent variables; moreover, the latent CI tests typically adopted suffer mainly from robustness or efficiency issues. Accordingly, this article investigates the problem of testing CI between latent variables. To this end, we offer an auxiliary regression-based CI (AReCI) test that takes the measured variables as surrogates of the latent variables and conducts the regression over the latent variables under linear causal models, in which each latent variable has certain measured variables. Specifically, given a pair of latent variables $L_X$ and $L_Y$, and a corresponding latent variable set $L_O$, [Formula: see text] holds if and only if [Formula: see text] and [Formula: see text] are statistically independent, where $A'$ and $A''$ are two disjoint subsets of the measured variables for the corresponding latent variables, $A'_{L_O} \cap A''_{L_O} = \emptyset$, $\omega_1$ is a parameter vector characterized from the cross covariance between $A_{L_X}$ and $A'_{L_O}$, and $\omega_2$ is a parameter vector characterized from the cross covariance between $A_{L_Y}$ and $A''_{L_O}$. We theoretically show that the AReCI test is capable of addressing both Gaussian and non-Gaussian data. In addition, we find that the well-known partial correlation test can be seen as a special case of the AReCI test. Finally, we devise a causal discovery method that uses the AReCI test as the CI test. The experimental results on synthetic and real-world data illustrate the effectiveness of our method.
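The residual idea in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' implementation: the data-generating model, the function names, and in particular the estimator for the weights (a ratio of cross-covariances that uses the second indicator of the conditioning latent variable as an auxiliary variable) are assumptions made for illustration. Plain correlation stands in for an independence test, which is only adequate for Gaussian data; a non-Gaussian setting would need a general independence test.

```python
import random

def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def corr(u, v):
    """Sample Pearson correlation."""
    return cov(u, v) / (cov(u, u) * cov(v, v)) ** 0.5

def residual_corr(x, y, o1, o2):
    """Correlation between the two auxiliary-regression residuals.

    x, y   : measured indicators of the latent pair (hypothetical names)
    o1, o2 : two distinct measured indicators of the conditioning latent
             variable, playing the role of the disjoint subsets A' and A''
    """
    # Assumed estimator: each weight is a ratio of cross-covariances, with
    # the *other* indicator used so that measurement noise cancels out.
    w1 = cov(x, o2) / cov(o1, o2)
    w2 = cov(y, o1) / cov(o2, o1)
    rx = [a - w1 * b for a, b in zip(x, o1)]  # residual built from A'
    ry = [a - w2 * b for a, b in zip(y, o2)]  # residual built from A''
    return corr(rx, ry)

def simulate(n, direct_effect, seed=0):
    """Linear Gaussian toy model: L_O -> L_X, L_O -> L_Y, optional L_X -> L_Y."""
    rng = random.Random(seed)
    x, y, o1, o2 = [], [], [], []
    for _ in range(n):
        lo = rng.gauss(0, 1)
        lx = 0.8 * lo + rng.gauss(0, 1)
        ly = 0.6 * lo + direct_effect * lx + rng.gauss(0, 1)
        # Every latent variable is seen only through noisy measurements.
        x.append(lx + rng.gauss(0, 0.5))
        y.append(ly + rng.gauss(0, 0.5))
        o1.append(lo + rng.gauss(0, 0.5))
        o2.append(lo + rng.gauss(0, 0.5))
    return x, y, o1, o2

# CI holds when there is no direct effect; it fails when L_X also drives L_Y.
r_ci = residual_corr(*simulate(20000, direct_effect=0.0))
r_dep = residual_corr(*simulate(20000, direct_effect=0.7))
print("residual correlation, CI holds:  ", round(r_ci, 3))
print("residual correlation, CI violated:", round(r_dep, 3))
```

In this toy setup the residual correlation should be close to zero when the latent pair is conditionally independent and clearly nonzero otherwise. Using two different indicators for the two residuals is what keeps the shared measurement noise from inducing a spurious dependence.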
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture