Unifying Invariant and Variant Features for Graph Out-of-Distribution via Probability of Necessity and Sufficiency

Xuexin Chen, Ruichu Cai, Kaitao Zheng, Zhifan Jiang, Zhengting Huang, Zhifeng Hao, Zijian Li

2024-07-22

Abstract: Graph Out-of-Distribution (OOD) generalization, which requires that models trained on biased data generalize to unseen test data, has considerable real-world applications. A mainstream approach is to extract the invariant subgraph by aligning the original and augmented data with the help of environment augmentation. However, these solutions might lead to the loss or redundancy of semantic subgraphs and result in suboptimal generalization. To address this challenge, we propose exploiting the Probability of Necessity and Sufficiency (PNS) to extract sufficient and necessary invariant substructures. Beyond that, we further leverage the domain variant subgraphs related to the labels to boost generalization performance in an ensemble manner. Specifically, we first consider the data generation process for graph data. Under mild conditions, we show that the sufficient and necessary invariant subgraph can be extracted by minimizing an upper bound, built on the theoretical advance of the probability of necessity and sufficiency. To further bridge theory and algorithm, we devise a model called Sufficiency and Necessity Inspired Graph Learning (SNIGL), which ensembles an invariant subgraph classifier built on latent sufficient and necessary invariant subgraphs with a domain variant subgraph classifier specific to the test domain for generalization enhancement. Experimental results demonstrate that our SNIGL model outperforms state-of-the-art techniques on six public benchmarks, highlighting its effectiveness in real-world scenarios.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses the Out-of-Distribution (OOD) generalization challenge that Graph Neural Networks (GNNs) face on graph data. Specifically, it targets the following points:

1. **Extracting Optimal Invariant Subgraphs**: Existing methods extract invariant features through environment augmentation to achieve domain generalization. However, they often struggle to reach an optimal trade-off between invariance alignment and prediction accuracy. This can lead to the loss or redundancy of semantic subgraphs, which hurts generalization performance.
2. **Utilizing Necessary and Sufficient Invariant Substructures**: The paper proposes a framework that uses the Probability of Necessity and Sufficiency (PNS) to extract invariant substructures that are both sufficient and necessary for the label, with the aim of avoiding both the loss and the redundancy described above.
3. **Incorporating Domain-Variant Features**: To further improve performance on unseen data, the paper also considers domain-variant subgraphs that are related to the labels and integrates them with the invariant subgraphs to enhance generalization.

In summary, the core contribution is a method named Sufficiency and Necessity Inspired Graph Learning (SNIGL). It extracts necessary and sufficient invariant subgraph features from the training data and combines them with domain-specific variant features in an ensemble. According to the reported experiments, SNIGL outperforms state-of-the-art techniques on six public benchmarks.
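To make the PNS notion above concrete, the following is a minimal sketch and not the paper's estimator or its upper-bound objective. It assumes a binary indicator `x` (for example, whether a candidate substructure is present in a graph) and a binary label `y`, and it assumes exogeneity (no confounding between `x` and `y`). Under that assumption, the standard bounds are max(0, P(y|x) - P(y|x')) <= PNS <= min(P(y|x), P(y'|x')), and if monotonicity also holds, PNS equals the lower bound. The function name `pns_bounds` and the toy data are hypothetical and used only for illustration.

```python
from typing import Sequence, Tuple


def pns_bounds(x: Sequence[int], y: Sequence[int]) -> Tuple[float, float]:
    """Lower and upper bounds on PNS for binary x and y.

    Assumption (not from the paper): x is exogenous, so interventional
    probabilities equal the conditionals P(y=1 | x=1) and P(y=1 | x=0).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n_x1 = sum(1 for xi in x if xi == 1)
    n_x0 = len(x) - n_x1
    if n_x1 == 0 or n_x0 == 0:
        raise ValueError("need samples with both x=1 and x=0")

    # Empirical conditionals P(y=1 | x=1) and P(y=1 | x=0).
    p_y_x1 = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1) / n_x1
    p_y_x0 = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1) / n_x0

    lower = max(0.0, p_y_x1 - p_y_x0)    # equals PNS under monotonicity
    upper = min(p_y_x1, 1.0 - p_y_x0)    # min(P(y|x), P(y'|x'))
    return lower, upper


# Toy data: the substructure is present in the first four graphs only.
x = [1, 1, 1, 1, 0, 0, 0, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0]
print(pns_bounds(x, y))
```

A feature with a high lower bound is one whose presence tends to produce the label (sufficiency) and whose absence tends to remove it (necessity), which is the property the summary attributes to the invariant substructures SNIGL looks for.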


Unifying Invariance and Spuriousity for Graph Out-of-Distribution via Probability of Necessity and Sufficiency

Invariant Graph Learning Meets Information Bottleneck for Out-of-Distribution Generalization

DIVE: Subgraph Disagreement for Graph Out-of-Distribution Generalization

Graph Invariant Learning with Subgraph Co-mixup for Out-Of-Distribution Generalization

Improving Graph Out-of-distribution Generalization on Real-world Data

Dissecting the Failure of Invariant Learning on Graphs

Invariant Learning via Probability of Sufficient and Necessary Causes

A Simple Data Augmentation for Graph Classification: A Perspective of Equivariance and Invariance

Subgraph Aggregation for Out-of-Distribution Generalization on Graphs

Graph Out-of-Distribution Generalization via Causal Intervention

IENE: Identifying and Extrapolating the Node Environment for Out-of-Distribution Generalization on Graphs

Individual and Structural Graph Information Bottlenecks for Out-of-Distribution Generalization

GraphDE: A Generative Framework for Debiased Learning and Out-of-Distribution Detection on Graphs

Does Invariant Graph Learning via Environment Augmentation Learn Invariance?

Handling Distribution Shifts on Graphs: An Invariance Perspective

Improving out-of-distribution generalization in graphs via hierarchical semantic environments

Graphs Generalization under Distribution Shifts

Utilizing Edge Features in Graph Neural Networks Via Variational Information Maximization

Empowering Graph Invariance Learning with Deep Spurious Infomax

Graph Structure and Feature Extrapolation for Out-of-Distribution Generalization