Identification of Causal Structure with Latent Variables Based on Higher Order Cumulants

Wei Chen, Zhiyi Huang, Ruichu Cai, Zhifeng Hao, Kun Zhang
2023-12-19
Abstract: Causal discovery with latent variables is a crucial but challenging task. Despite the emergence of numerous methods aimed at addressing this challenge, they cannot fully identify the structure in which two observed variables are influenced by one latent variable and there might be a directed edge between them. Interestingly, we notice that this structure can be identified through the utilization of higher-order cumulants. By leveraging the higher-order cumulants of non-Gaussian data, we provide an analytical solution for estimating the causal coefficients or their ratios. With the estimated (ratios of) causal coefficients, we propose a novel approach to identify the existence of a causal edge between two observed variables subject to latent variable influence. In the case where such a causal edge exists, we introduce an asymmetry criterion to determine the causal direction. The experimental results demonstrate the effectiveness of our proposed method.
Machine Learning, Artificial Intelligence, Methodology
What problem does this paper attempt to address?
The paper addresses causal discovery in the presence of latent variables. Specifically, when two observed variables are both influenced by the same latent variable, it asks how to identify whether there is also a causal relationship between them and, if so, how to determine its direction.

### Research Background and Objectives

- **Background**: Causal discovery is a significant topic in fields such as statistics and machine learning, especially in the presence of latent variables. Although many methods have been proposed, they often fail to fully identify the structure in which two observed variables are influenced by a latent variable, i.e., they cannot determine whether there is a direct causal relationship between these two variables.
- **Objectives**: The paper aims to solve the above problem by utilizing higher-order cumulants, specifically:
  - Detecting whether there is a causal edge between two observed variables.
  - If a causal edge exists, further determining its direction.

### Main Contributions

- **Theoretical Foundation**: The authors propose an analytical solution based on higher-order cumulants of non-Gaussian data to estimate causal coefficients or their ratios. Based on this, they design a new method to identify whether there is a causal edge between two observed variables influenced by a latent variable. If a causal edge exists, an asymmetry criterion is introduced to determine the causal direction.
- **Technical Means**: The method uses specific combinations of higher-order joint cumulants to detect whether there is a causal relationship between the observed variables. By comparing the relationships between cumulants of different orders, the existence of a causal edge can be determined.
- **Causal Direction Determination**: By analyzing third-order cumulants, an asymmetry criterion can be developed to determine the causal direction between two observed variables. This requires estimating the parameters of the shared component to reveal the causal relationship.
- **Empirical Study**: The effectiveness of the proposed method is validated through simulation experiments. Compared with other benchmark methods (such as LiNGAM, ANM, and LvLiNGAM) under different sample sizes, the proposed method based on higher-order cumulants is reported to perform well both in detecting the existence of causal edges and in determining the causal direction.

In summary, the main goal of this paper is to solve the problem of causal relationship identification when two observed variables are influenced by the same latent variable, and to demonstrate the effectiveness and practicality of the proposed method through theoretical analysis and empirical research.
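The idea of comparing cumulants of different orders can be illustrated with a minimal sketch. It is not the paper's actual criterion or estimator; it only assumes a linear non-Gaussian model of the form `x1 = a*L + e1`, `x2 = b*L + c*x1 + e2` with independent, centered exponential sources (all names, parameter values, and the model form are illustrative assumptions). By multilinearity of cumulants, when there is no edge (`c = 0`) both the third-order ratio `cum(x1,x2,x2) / cum(x1,x1,x2)` and the fourth-order ratio `cum(x1,x2,x2,x2) / cum(x1,x1,x2,x2)` equal `b/a`. When an edge is present, the noise `e1` enters both variables and the two ratios generally differ.

```python
import numpy as np

def cum3(x, y, z):
    """Sample third-order joint cumulant (equals E[xyz] after centering)."""
    x, y, z = (v - v.mean() for v in (x, y, z))
    return np.mean(x * y * z)

def cum4(w, x, y, z):
    """Sample fourth-order joint cumulant of centered variables."""
    w, x, y, z = (v - v.mean() for v in (w, x, y, z))
    return (np.mean(w * x * y * z)
            - np.mean(w * x) * np.mean(y * z)
            - np.mean(w * y) * np.mean(x * z)
            - np.mean(w * z) * np.mean(x * y))

def cumulant_ratios(x1, x2):
    """Return the third- and fourth-order cumulant ratios described above."""
    r3 = cum3(x1, x2, x2) / cum3(x1, x1, x2)
    r4 = cum4(x1, x2, x2, x2) / cum4(x1, x1, x2, x2)
    return r3, r4

def simulate(n, a, b, c, rng):
    """Hypothetical generator: latent L affects x1 and x2; c is the x1 -> x2 edge."""
    # Centered exponential sources are skewed and heavy-tailed, so their
    # third- and fourth-order cumulants are non-zero (non-Gaussian).
    L, e1, e2 = rng.exponential(size=(3, n)) - 1.0
    x1 = a * L + e1
    x2 = b * L + c * x1 + e2
    return x1, x2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for label, c, b in [("no edge", 0.0, 2.0), ("edge x1 -> x2", -1.0, 3.0)]:
        x1, x2 = simulate(1_000_000, a=1.0, b=b, c=c, rng=rng)
        r3, r4 = cumulant_ratios(x1, x2)
        print(f"{label}: third-order ratio={r3:.2f}, fourth-order ratio={r4:.2f}")
```

Under these assumptions the population ratios are both 2 in the no-edge setting and 5 versus 1.4 in the edge setting; sample estimates fluctuate around these values, and fourth-order cumulants in particular need large samples to be stable.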