Abstract: Inferring causal links or subgraphs corresponding to a specific phenotype or label based solely on measured data is an important yet challenging task, and it differs from inferring causal nodes. While Graph Neural Network (GNN) explainers have shown potential in subgraph identification, existing GNN-based methods often offer associative rather than causal insights. This lack of transparency and explainability hinders our understanding of their results and of the underlying mechanisms. To address this issue, we propose a novel method for causal link/subgraph inference, called CIDER: Counterfactual-Invariant Diffusion-based GNN ExplaineR, which combines a counterfactual component with a diffusion component. In other words, it is a model-agnostic and task-agnostic framework for generating causal explanations based on a counterfactual-invariant diffusion process, which provides not only causal subgraphs, owing to the counterfactual component, but also reliable causal links, owing to the diffusion process. Specifically, CIDER is first formulated as an inference task that generatively provides two distributions, one over the causal subgraph and one over the spurious subgraph. Then, to enhance reliability, we further model the CIDER framework as a diffusion process. Using the causal subgraph distribution, we can thus explicitly quantify the contribution of each subgraph to a phenotype/label in a counterfactual manner, which represents each subgraph's causal strength. From a causality perspective, CIDER is an interventional causal method, different from traditional association studies or observational causal approaches, and can also reduce the effects of unobserved confounders. We evaluate CIDER on both synthetic and real-world datasets, all of which demonstrate the superiority of CIDER over state-of-the-art methods.

What problem does this paper attempt to address?

The paper addresses how to infer the causal subgraph associated with a specific phenotype or label from measured data alone. This task differs from inferring causal nodes, and most existing graph neural network (GNN) explainers offer insights into correlation rather than causality. This lack of transparency and interpretability hinders our understanding of the results and their underlying mechanisms. Specifically, the paper targets the following problems:

1. **Limitations of existing GNN explainers**: Existing GNN explainers usually provide only correlational explanations rather than causal ones. They can point out which subgraphs are related to a specific task, but cannot determine whether these subgraphs actually affect the label or phenotype of the sample.
2. **Impact of unobserved confounders**: Traditional methods struggle to handle unobserved confounders, which may lead to incorrect causal inferences.
3. **Improving the reliability and accuracy of causal inferences**: A new method is needed that improves the reliability of causal inferences and can quantify the causal contribution of each subgraph to the label.

To solve these problems, the paper proposes a new framework named CIDER (Counterfactual-Invariant Diffusion-based GNN Explainer for Causal Subgraph Inference). CIDER aims to directly infer the causal subgraph and its causal strength on the label from high-dimensional measurement data by combining counterfactual invariance with a diffusion process. Specifically, CIDER pursues the following goals:

- **Generate the distributions of causal subgraphs and spurious subgraphs**: By dividing the entire graph into a causal subgraph and a spurious subgraph, CIDER can directly predict the distributions of these two subgraphs.
- **Enhance the reasoning ability using the diffusion process**: By denoising and refining the spurious subgraphs in each step of the diffusion process, CIDER can gradually converge to a more reliable distribution of causal subgraphs during training.
- **Reduce the impact of unobserved confounders**: As an interventional causal method, CIDER can reduce the impact of unobserved confounders on causal inferences.

In summary, the main purpose of this paper is to develop a new method that can accurately identify causal subgraphs from data and explain their impact on labels or phenotypes, thereby overcoming the limitations of existing GNN explainers and improving the reliability and accuracy of causal inferences.
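The idea of splitting a graph into a causal and a spurious subgraph and then scoring each one counterfactually can be illustrated with a small toy sketch. This is not CIDER's implementation: `toy_model`, `split_graph`, `counterfactual_strength`, and the hard-coded `edge_scores` are all hypothetical names and values chosen for illustration. In the actual method the edge partition would be sampled from learned distributions and refined by a diffusion process, neither of which is modelled here; the sketch only shows what "the change in prediction when a subgraph is removed" means.

```python
import math

def toy_model(edges):
    """Hypothetical stand-in for a trained GNN: returns P(label = 1 | graph).

    Assumption: the label is driven only by a triangle motif on nodes 0, 1, 2.
    """
    motif = {(0, 1), (1, 2), (0, 2)}
    hits = sum(1 for e in edges if e in motif)
    return 1.0 / (1.0 + math.exp(-(2.0 * hits - 3.0)))

def split_graph(edges, edge_scores, threshold=0.5):
    """Partition edges into a causal and a spurious subgraph via a soft mask."""
    causal = [e for e in edges if edge_scores[e] >= threshold]
    spurious = [e for e in edges if edge_scores[e] < threshold]
    return causal, spurious

def counterfactual_strength(model, edges, subgraph):
    """Change in the model's prediction when `subgraph` is deleted."""
    removed = set(subgraph)
    remaining = [e for e in edges if e not in removed]
    return model(edges) - model(remaining)

# Toy graph: a triangle (0, 1, 2) plus a tail (2-3-4) unrelated to the label.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]

# Assumed per-edge scores; a real explainer would learn these.
edge_scores = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (2, 3): 0.2, (3, 4): 0.1}

causal, spurious = split_graph(edges, edge_scores)
causal_effect = counterfactual_strength(toy_model, edges, causal)
spurious_effect = counterfactual_strength(toy_model, edges, spurious)

print("causal subgraph:", causal, "effect:", round(causal_effect, 3))
print("spurious subgraph:", spurious, "effect:", round(spurious_effect, 3))
```

In this toy setting, deleting the causal subgraph changes the prediction substantially, while deleting the spurious subgraph leaves it unchanged. That contrast is what distinguishes a counterfactual explanation from a purely associative one.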

CIDER: Counterfactual-Invariant Diffusion-based GNN Explainer for Causal Subgraph Inference

Graph Neural Network Causal Explanation via Neural Causal Models

G-Censor: Graph Contrastive Learning with Task-Oriented Counterfactual Views

Towards Human-like Perception: Learning Structural Causal Model in Heterogeneous Graph

Incorporating Retrieval-based Causal Learning with Information Bottlenecks for Interpretable Graph Neural Networks

Causal GNNs: A GNN-Driven Instrumental Variable Approach for Causal Inference in Networks

Reinforced Causal Explainer for Graph Neural Networks.

SIG: Efficient Self-Interpretable Graph Neural Network for Continuous-time Dynamic Graphs

OrphicX: A Causality-Inspired Latent Variable Model for Interpreting Graph Neural Networks

Causality-based CTR Prediction using Graph Neural Networks

Cooperative Explanations of Graph Neural Networks.

Rethinking Causal Relationships Learning in Graph Neural Networks

Heterophilic Graph Neural Networks Optimization with Causal Message-passing

Invariant Graph Learning for Causal Effect Estimation

Global Graph Counterfactual Explanation: A Subgraph Mapping Approach

When Graph Neural Network Meets Causality: Opportunities, Methodologies and An Outlook

GreeDy and CoDy: Counterfactual Explainers for Dynamic Graphs

CI-GNN: A Granger causality-inspired graph neural network for interpretable brain network-based psychiatric diagnosis

Global Counterfactual Explainer for Graph Neural Networks

Towards Faithful and Consistent Explanations for Graph Neural Networks