Causal Inference on Time Series using Structural Equation Models

Jonas Peters, Dominik Janzing, Bernhard Schölkopf
DOI: https://doi.org/10.48550/arXiv.1207.5136
2012-07-21
Abstract: Causal inference uses observations to infer the causal structure of the data generating system. We study a class of functional models that we call Time Series Models with Independent Noise (TiMINo). These models require independent residual time series, whereas traditional methods like Granger causality exploit the variance of residuals. There are two main contributions: (1) Theoretical: By restricting the model class (e.g. to additive noise) we can provide a more general identifiability result than existing ones. This result incorporates lagged and instantaneous effects that can be nonlinear and do not need to be faithful, and non-instantaneous feedbacks between the time series. (2) Practical: If there are no feedback loops between time series, we propose an algorithm based on non-linear independence tests of time series. When the data are causally insufficient, or the data generating process does not satisfy the model assumptions, this algorithm may still give partial results, but mostly avoids incorrect answers. An extension to (non-instantaneous) feedbacks is possible, but not discussed. It outperforms existing methods on artificial and real data. Code can be provided upon request.
Machine Learning, Methodology
What problem does this paper attempt to address?
This paper addresses causal inference on time-series data. The authors propose a model class, TiMINo (Time Series Models with Independent Noise), and a corresponding algorithm, TiMINo causality, which infers the causal structure between time series from observational data. The key problems the paper addresses are:

1. **Limitations of existing methods**
   - **Instantaneous effects**: Methods such as Granger causality cannot handle instantaneous effects, where one variable influences another at the same time point. For example, when \(X_t\) affects \(Y_t\), each time series helps predict the other, so Granger causality wrongly infers both \(X \rightarrow Y\) and \(Y \rightarrow X\).
   - **Confounding**: Existing methods may fail when there are unobserved common causes (confounders). For example, if an unobserved confounder affects both \(X_t\) and \(Y_{t + 1}\), conditioning on observed variables cannot block the path between them, and Granger causality wrongly infers \(X \rightarrow Y\).
   - **Inappropriate model assumptions**: Conditional independence tests in existing methods usually rely on simple models, such as linear ones. If the data do not conform to these assumptions, the causal conclusions may be wrong.
2. **Contributions of the TiMINo model**
   - **Theoretical**: By restricting the model class (e.g., to additive noise models), the authors obtain a more general identifiability result than existing ones. The result covers lagged and instantaneous effects, which can be nonlinear and need not satisfy faithfulness, as well as non-instantaneous feedback between the time series.
   - **Practical**: For the case without feedback loops between time series, the authors propose an algorithm based on nonlinear independence tests. When the data are causally insufficient or the generating process violates the model assumptions, the algorithm may still give partial results and mostly avoids wrong causal conclusions. The authors report that it outperforms existing methods on both artificial and real data.
3. **Specific problems**
   - **Recovering the causal structure from a finite sample**: The TiMINo causality algorithm recovers the model structure from a finite sample. It applies to a broad class of models and can use any time-series fitting method supplied to it.
   - **Handling data that violate the model assumptions**: When the assumptions do not hold, the algorithm mostly remains undecided rather than drawing wrong causal conclusions.
   - **Extending to non-instantaneous feedback**: The paper notes that an extension to non-instantaneous feedback is possible but does not discuss it.

In conclusion, by introducing the TiMINo model class and the corresponding algorithm, the paper addresses limitations of existing causal inference methods for time-series data and offers a more general and more cautious way to infer causal structure.
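
The fit-then-test idea summarized above can be illustrated with a minimal Python sketch for two series. It is not the authors' implementation: the simulated data, the polynomial least-squares fit, the lag order of 1, the HSIC statistic with a median-heuristic bandwidth, and the function names `simulate`, `fit_residuals`, `hsic`, and `infer_direction` are all assumptions made for illustration. Each series is fitted in turn as the candidate effect, and the direction whose residuals depend least on the other series is preferred.

```python
import numpy as np


def simulate(n, seed=0, burn_in=50):
    """Toy data (an assumption): X drives Y instantaneously and nonlinearly."""
    rng = np.random.default_rng(seed)
    total = n + burn_in
    noise_x = rng.uniform(-1.0, 1.0, total)
    noise_y = rng.uniform(-0.3, 0.3, total)
    x = np.zeros(total)
    y = np.zeros(total)
    for t in range(1, total):
        x[t] = 0.3 * x[t - 1] + noise_x[t]
        y[t] = 0.3 * y[t - 1] + x[t] ** 2 + noise_y[t]
    return x[burn_in:], y[burn_in:]


def fit_residuals(target, parent):
    """Residuals of `target` given its own lag and the candidate parent.

    A quadratic least-squares fit stands in for the time-series fitting
    method, which the text leaves open.
    """
    response = target[1:]
    features = np.column_stack([
        np.ones(len(response)),
        target[:-1],                     # own past (lag 1)
        parent[1:], parent[1:] ** 2,     # instantaneous effect
        parent[:-1], parent[:-1] ** 2,   # lagged effect (lag 1)
    ])
    coef, *_ = np.linalg.lstsq(features, response, rcond=None)
    return response - features @ coef


def hsic(a, b):
    """Biased HSIC statistic with Gaussian kernels; larger means more dependent."""
    def centered_gram(v):
        sq_dist = (v[:, None] - v[None, :]) ** 2
        bandwidth_sq = np.median(sq_dist[sq_dist > 0])  # median heuristic
        gram = np.exp(-sq_dist / (2.0 * bandwidth_sq))
        center = np.eye(len(v)) - 1.0 / len(v)          # I - (1/n) * ones
        return center @ gram @ center
    n = len(a)
    return float(np.sum(centered_gram(a) * centered_gram(b)) / n ** 2)


def infer_direction(x, y):
    """Prefer the direction whose residuals depend least on the other series."""
    dep_xy = hsic(fit_residuals(y, x), x[1:])  # candidate model X -> Y
    dep_yx = hsic(fit_residuals(x, y), y[1:])  # candidate model Y -> X
    direction = "X -> Y" if dep_xy < dep_yx else "Y -> X"
    return direction, dep_xy, dep_yx


x, y = simulate(500)
direction, dep_xy, dep_yx = infer_direction(x, y)
print(f"{direction}: HSIC(X -> Y) = {dep_xy:.5f}, HSIC(Y -> X) = {dep_yx:.5f}")
```

The sketch only compares raw dependence statistics at matching time points. A fuller procedure would turn the statistic into a significance test on the whole residual series, including lagged dependence, and would stay undecided when neither direction yields independent residuals.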