Causal Inference in Geosciences with Kernel Sensitivity Maps

Adrián Pérez-Suay, Gustau Camps-Valls
DOI: https://doi.org/10.48550/arXiv.2012.14303
2020-12-08
Abstract: Establishing causal relations between random variables from observational data is perhaps the most important challenge in today's Science. In remote sensing and geosciences this is of special relevance to better understand the Earth's system and the complex and elusive interactions between processes. In this paper we explore a framework to derive cause-effect relations from pairs of variables via regression and dependence estimation. We propose to focus on the sensitivity (curvature) of the dependence estimator to account for the asymmetry of the forward and inverse densities of approximation residuals. Results in a large collection of 28 geoscience causal inference problems demonstrate the good capabilities of the method.
Machine Learning, Signal Processing, Methodology
What problem does this paper attempt to address?
This paper addresses the problem of establishing causal relationships between random variables from observational data, particularly in Earth science and remote sensing. Specifically, the authors derive the causal direction between pairs of variables through regression and dependence estimation, and propose a new method based on kernel sensitivity maps to better capture the asymmetry between the forward and inverse densities of the regression residuals.

### Core Problems of the Paper

1. **Challenges in Causal Inference**:
   - Determining causal relationships between random variables from observational data is an important challenge in current scientific research.
   - In Earth science and remote sensing, this is crucial for understanding the Earth system and the complex, elusive interactions between its processes.
2. **Limitations of Existing Methods**:
   - Existing causal inference methods such as Granger causality analysis, constraint-based search, and conditional independence measures have limitations when applied to Earth system data.
   - Randomized controlled experiments cannot be performed on the Earth system, so causal inference must rely on observational data.

### Proposed Method

- **Regression and Dependence Estimation Framework**:
  - Fit each variable pair with a nonlinear regression model (such as a Gaussian process) in both directions and evaluate the independence between the input and the residuals of the forward and inverse models.
  - Use the Hilbert-Schmidt Independence Criterion (HSIC) as the dependence measure.
- **Kernel Sensitivity Maps**:
  - Propose a new criterion based on the sensitivity of HSIC. Sensitivity maps are generated by computing the derivatives of HSIC with respect to the input samples and features.
  - Sensitivity maps reveal which features and samples have the greatest impact on the dependence estimate, which helps to determine the causal direction.
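
The regression-and-dependence framework described above can be illustrated with a minimal sketch. It is not the authors' implementation: kernel ridge regression stands in for the Gaussian process (its prediction coincides with the GP posterior mean for the same kernel and noise level), the kernel width is set by a median heuristic, and the names `rbf_kernel`, `median_sigma`, `residuals`, `causal_direction` and the regularizer `lam` are all assumptions made for illustration. The dependence measure is the biased empirical HSIC estimate, Tr(H K_x H K_y)/n², and the rule of preferring the direction with the less dependent residuals is the usual additive-noise heuristic, assumed here.

```python
import numpy as np

def _as_2d(a):
    # Accept 1-D or 2-D input and return an (n, d) float array.
    a = np.asarray(a, dtype=float)
    return a.reshape(len(a), -1)

def rbf_kernel(a, b, sigma):
    # Gaussian (RBF) kernel matrix between the rows of a and b.
    a, b = _as_2d(a), _as_2d(b)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def median_sigma(a):
    # Median heuristic (an assumption): median of nonzero pairwise distances.
    a = _as_2d(a)
    d = np.sqrt(((a[:, None, :] - a[None, :, :]) ** 2).sum(-1))
    return np.median(d[d > 0])

def hsic(x, y):
    # Biased empirical HSIC: Tr(H Kx H Ky) / n^2.
    n = len(x)
    Kx = rbf_kernel(x, x, median_sigma(x))
    Ky = rbf_kernel(y, y, median_sigma(y))
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(H @ Kx @ H @ Ky) / n ** 2

def residuals(inp, out, lam=1e-2):
    # Kernel ridge regression of out on inp; lam is a hypothetical regularizer.
    out = np.asarray(out, dtype=float)
    K = rbf_kernel(inp, inp, median_sigma(inp))
    alpha = np.linalg.solve(K + lam * np.eye(len(out)), out)
    return out - K @ alpha

def causal_direction(x, y):
    # Dependence between each candidate cause and the residuals of its model.
    forward = hsic(x, residuals(x, y))    # hypothesis x -> y
    backward = hsic(y, residuals(y, x))   # hypothesis y -> x
    label = "x->y" if forward < backward else "y->x"
    return label, forward, backward

# Synthetic pair generated as y = f(x) + noise, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 100)
y = x ** 3 + 0.05 * rng.normal(size=100)
print(causal_direction(x, y))
```

The sketch compares raw HSIC values only; the sensitivity-based criterion proposed in the paper replaces this comparison and is not reproduced here.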
### Formula Representation

- **HSIC Formula**:

  \[ \text{HSIC}(F, G, P_{xy}) = \frac{1}{n^{2}}\text{Tr}(H K_x H K_y) \]

  where \( K_x \) and \( K_y \) are the kernel matrices of the input random variables \( x \) and \( y \) respectively, and \( H = I - \frac{1}{n}11^{\top} \) is the centering matrix.
- **Sensitivity Map Formula**:

  \[ S^x_{ij} = \frac{\partial \text{HSIC}}{\partial X_{ij}} = -\frac{2}{\sigma^{2}n^{2}}\text{Tr}(H K_y H (K_x \circ M_j)) \]

  \[ S^y_{ij} = \frac{\partial \text{HSIC}}{\partial Y_{ij}} = -\frac{2}{\sigma^{2}n^{2}}\text{Tr}(H K_x H (K_y \circ N_j)) \]

  where \( M_j \) and \( N_j \) are the matrices corresponding to the \( j \)-th feature of \( X \) and \( Y \) respectively, and the symbol \( \circ \) denotes the Hadamard product.

### Results and Conclusions

- **Experimental Results**:
  - Experiments were carried out on a collection of 28 Earth science causal inference problems. The results show that the proposed sensitivity-based causal inference criterion \( \hat{C}_s \) outperforms the traditional HSIC method.
  - The ROC curves and AUC values indicate that the new method performs better across different sample sizes.
- **Conclusions**:
  - This study provides a new causal inference method based on observational data, which is especially suitable for Earth science and remote sensing.
  - By focusing on the sensitivity of the dependence estimator, the asymmetry in causal relationships can be better captured, improving the accuracy and robustness of causal inference.
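
As a supplementary sketch of the sensitivity map formula given under Formula Representation, the code below computes \( S^x_{ij} \) for the RBF kernel \( K_x(i,k) = \exp(-\|x_i - x_k\|^2 / (2\sigma^2)) \) with a fixed width \( \sigma \). It assumes that, for sample \( i \), the matrix \( M_j \) holds the differences \( x_{ij} - x_{kj} \), so that the trace reduces to a sum over \( k \); this reading, the kernel choice, and the function names (`rbf_kernel`, `hsic`, `sensitivity_x`) are assumptions and not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, sigma):
    # RBF kernel matrix of the rows of X, shape (n, n).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma_x, sigma_y):
    # Biased empirical HSIC: Tr(H Kx H Ky) / n^2.
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(H @ rbf_kernel(X, sigma_x) @ H @ rbf_kernel(Y, sigma_y)) / n ** 2

def sensitivity_x(X, Y, sigma_x, sigma_y):
    # S[i, j] = dHSIC / dX[i, j]
    #         = -2 / (sigma_x^2 n^2) * sum_k A[i, k] Kx[i, k] (X[i, j] - X[k, j]),
    # with A = H Ky H. The kernel widths are treated as constants.
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    A = H @ rbf_kernel(Y, sigma_y) @ H
    W = A * rbf_kernel(X, sigma_x)            # Hadamard product
    diff = X[:, None, :] - X[None, :, :]      # diff[i, k, j] = X[i, j] - X[k, j]
    return -2.0 / (sigma_x ** 2 * n ** 2) * np.einsum("ik,ikj->ij", W, diff)

# Toy data: Y depends on the first feature of X only.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Y = X[:, :1] ** 2 + 0.1 * rng.normal(size=(20, 1))
S = sensitivity_x(X, Y, sigma_x=1.0, sigma_y=1.0)
print(S.shape)  # one sensitivity value per sample and feature
```

The map for \( Y \) follows by swapping the roles of the two arguments, e.g. `sensitivity_x(Y, X, sigma_y, sigma_x)`. A finite-difference check of `hsic` against individual entries of `S` is a simple way to validate such a gradient.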