DeCaf: A Causal Decoupling Framework for OOD Generalization on Node Classification

Xiaoxue Han, Huzefa Rangwala, Yue Ning
2024-10-27
Abstract: Graph Neural Networks (GNNs) are susceptible to distribution shifts, creating vulnerability and security issues in critical domains. There is a pressing need to enhance the generalizability of GNNs on out-of-distribution (OOD) test data. Existing methods that target learning an invariant (feature, structure)-label mapping often depend on oversimplified assumptions about the data generation process, which do not adequately reflect the actual dynamics of distribution shifts in graphs. In this paper, we introduce a more realistic graph data generation model using Structural Causal Models (SCMs), allowing us to redefine distribution shifts by pinpointing their origins within the generation process. Building on this, we propose a causal decoupling framework, DeCaf, that independently learns unbiased feature-label and structure-label mappings. We provide a detailed theoretical framework that shows how our approach can effectively mitigate the impact of various distribution shifts. We evaluate DeCaf across both real-world and synthetic datasets that demonstrate different patterns of shifts, confirming its efficacy in enhancing the generalizability of GNNs.
Machine Learning
What problem does this paper attempt to address?
This paper attempts to address the generalization problem of Graph Neural Networks (GNNs) on Out-of-Distribution (OOD) data. Specifically, the paper points out that current GNNs perform poorly under distribution shifts (such as data from different geographical regions, domains, or time periods), which degrades model performance on test data. To solve this problem, the paper proposes a new causal decoupling framework, DeCaf.

### Core Problems of the Paper

1. **Out-of-Distribution Generalization Problem**:
   - When the training and test data distributions of a GNN differ, the model is prone to learning incorrect feature-structure-label mappings, resulting in poor performance on the test set.
   - Existing methods usually rely on overly simplified assumptions and cannot fully reflect how distributions shift in practice.
2. **Limitations of Existing Methods**:
   - Existing methods usually assume the existence of an invariant feature-label mapping, but they fail to distinguish between feature shifts and structure shifts and ignore their complex interactions.
   - In fact, feature and structure shifts may be independent and have different impacts on labels.

### Solution: DeCaf Framework

To overcome the above problems, the paper introduces the following innovations:

1. **New Graph Generation Model**:
   - Use Structural Causal Models (SCMs) to redefine the sources of distribution shifts and model the graph data generation process more realistically.
   - Through SCMs, the feature-label and structure-label relationships can be separated, thereby capturing the nature of distribution shifts more accurately.
2. **Causal Decoupling Framework**:
   - The DeCaf framework aims to independently learn unbiased feature-label and structure-label mappings.
   - By treating the influence of features or structure on the label as a causal effect and using a causal estimation model that accounts for confounding factors, the framework aims to keep the estimates unbiased.
3. **Theoretical Analysis and Practical Verification**:
   - Provide a detailed theoretical framework showing that this method can effectively mitigate the effects of various distribution shifts.
   - Conduct experiments on multiple real-world and synthetic datasets to verify the effectiveness of DeCaf in enhancing the generalization ability of GNNs.

### Formula Representation

The formulas involved in the paper are as follows:

- Data Generation Formulas (features, labels, and edges):

\[ x_i = M_f z_i + b_f \]

\[ y_i = M_y z_i + b_y \]

\[ p(A_{ij} = 1) = c \cdot \left( \frac{\|M_s z_i - M_o z_j\|_2^2}{2} + 1 \right)^{-1} \]

- Causal Effect Estimation Formulas:

\[ \Psi(x, a) = \gamma \cdot \Psi_x(x) + (1 - \gamma) \cdot \Psi_a(a) \]

\[ \tau(a', a, x) = E[h_y \mid C = x, do(T = a)] - E[h_y \mid C = x, do(T = a')] \]

\[ \tau(x', x, a) = E[h_y \mid C = a, do(T = x)] - E[h_y \mid C = a, do(T = x')] \]

Through these innovations, the paper reports that DeCaf improves the generalization ability of GNNs on out-of-distribution data, offering a new approach to OOD problems in real-world settings.
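
To make the formulas above concrete, here is a minimal NumPy sketch of the generation equations and the weighted combination \(\Psi(x, a)\). It is an illustration under stated assumptions, not the authors' implementation: the function names `generate_graph` and `combine`, the Gaussian sampling of \(z_i\) and of the matrices, and the dimensions are all hypothetical choices made only so the example runs.

```python
import numpy as np


def generate_graph(n, d_z, d_x, d_y, c=0.5, seed=0):
    """Sample features, label scores, and edges from per-node variables z_i.

    Assumption: z_i and all matrices/biases are drawn from a standard normal;
    the summary does not say how they are chosen.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, d_z))

    # x_i = M_f z_i + b_f  and  y_i = M_y z_i + b_y
    M_f, b_f = rng.normal(size=(d_x, d_z)), rng.normal(size=d_x)
    M_y, b_y = rng.normal(size=(d_y, d_z)), rng.normal(size=d_y)
    x = z @ M_f.T + b_f
    y = z @ M_y.T + b_y

    # p(A_ij = 1) = c * (||M_s z_i - M_o z_j||_2^2 / 2 + 1)^(-1)
    M_s = rng.normal(size=(d_z, d_z))
    M_o = rng.normal(size=(d_z, d_z))
    src = z @ M_s.T  # row i holds M_s z_i
    dst = z @ M_o.T  # row j holds M_o z_j
    sq_dist = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    p = c / (sq_dist / 2.0 + 1.0)

    # Each directed edge (i, j) is sampled independently with probability p_ij.
    A = rng.random((n, n)) < p
    return x, y, p, A


def combine(psi_x, psi_a, gamma):
    """Psi(x, a) = gamma * Psi_x(x) + (1 - gamma) * Psi_a(a)."""
    return gamma * psi_x + (1.0 - gamma) * psi_a


if __name__ == "__main__":
    x, y, p, A = generate_graph(n=6, d_z=3, d_x=4, d_y=2)
    print(x.shape, y.shape, p.shape, A.sum())
    print(combine(np.array([1.0, 0.0]), np.array([0.0, 1.0]), gamma=0.25))
```

Because the squared distance is non-negative, every edge probability lies in \((0, c]\), so any \(c \le 1\) yields valid probabilities. Since \(M_s\) and \(M_o\) differ, \(p(A_{ij} = 1)\) and \(p(A_{ji} = 1)\) are generally not equal, which is why the sketch samples directed edges.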