Abstract: In graph classification, the out-of-distribution (OOD) issue is attracting great attention. To address this issue, a prevailing idea is to learn stable features, on the assumption that they are substructures that causally determine the label and that their relationship with the label is stable under distributional uncertainty. In contrast, the complementary parts, termed environmental features, cannot determine the label on their own and hold varying relationships with the label; they are therefore regarded as a possible cause of the distribution shift. Existing generalization efforts mainly encourage the model's insensitivity to environmental features. Sensitivity to stable features, although promising for distinguishing the crucial clues from the distributional uncertainty, remains largely unexplored. To the best of our knowledge, a paradigm that simultaneously explores sensitivity to stable features and insensitivity to environmental features for generalizable graph classification has so far been lacking. In this work, we conjecture that generalizable models should be sensitive to stable features and insensitive to environmental features. To this end, we propose a simple yet effective augmentation strategy for graph classification: Equivariant and Invariant Cross-Data Augmentation (EI-CDA). Given a pair of input graphs, we first estimate their stable and environmental features via masks. Then, to enforce equivariance, we linearly mix the estimated stable features of the two graphs and encourage the model predictions to faithfully reflect their mixed semantics. Meanwhile, to enforce invariance, we swap the estimated environmental features of the two graphs and keep the predictions unchanged. This strategy endows the models with both sensitivity to stable features and insensitivity to environmental features. Extensive experiments show that EI-CDA significantly improves performance and outperforms leading baselines.
Our code is available at: https://github.com/yongduosui/EI-GNN.
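
The two augmentation steps described in the abstract (mixing stable features, swapping environmental features) can be illustrated with a minimal NumPy sketch. This is not the released EI-GNN implementation: the function names `pooled_parts`, `mix_stable`, and `swap_environment`, the use of per-node soft masks, the mask-weighted mean pooling, and the mixing ratio `lam` are all assumptions made for illustration, and the masks are given by hand here rather than estimated by a model.

```python
import numpy as np

def pooled_parts(x, mask, eps=1e-8):
    """Split a graph into graph-level stable and environmental vectors.

    x:    (num_nodes, dim) node features.
    mask: (num_nodes,) values in [0, 1]; 1 marks a node as stable (assumed given).
    Uses mask-weighted mean pooling, an assumption of this sketch.
    """
    m = mask[:, None]
    stable = (m * x).sum(axis=0) / (m.sum() + eps)
    env = ((1.0 - m) * x).sum(axis=0) / ((1.0 - m).sum() + eps)
    return stable, env

def mix_stable(stable_a, stable_b, y_a, y_b, lam):
    """Equivariance step: mix stable parts and mix the labels with the same ratio."""
    mixed = lam * stable_a + (1.0 - lam) * stable_b
    target = lam * y_a + (1.0 - lam) * y_b
    return mixed, target

def swap_environment(stable_a, env_a, stable_b, env_b):
    """Invariance step: exchange environmental parts; labels stay unchanged."""
    return (stable_a, env_b), (stable_b, env_a)

# Toy pair of graphs with different node counts and one-hot labels.
x_a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
m_a = np.array([1.0, 0.0, 0.0])
y_a = np.array([1.0, 0.0])

x_b = np.array([[2.0, 2.0], [0.0, 4.0]])
m_b = np.array([0.0, 1.0])
y_b = np.array([0.0, 1.0])

s_a, e_a = pooled_parts(x_a, m_a)
s_b, e_b = pooled_parts(x_b, m_b)

# Mixed stable vector paired with a mixed (soft) label.
mixed, target = mix_stable(s_a, s_b, y_a, y_b, lam=0.75)

# Swapped views: each keeps its own stable part and its original label.
view_a, view_b = swap_environment(s_a, e_a, s_b, e_b)
```

In a training loop, a classifier would be asked to predict `target` from `mixed` (sensitivity to stable features) and to predict `y_a` and `y_b` from `view_a` and `view_b` respectively (insensitivity to environmental features).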

Graph Data Augmentation for Node Classification

NodeAug: Semi-Supervised Node Classification with Data Augmentation

Towards data augmentation in graph neural network: An overview and evaluation

Knowledge Distillation Improves Graph Structure Augmentation for Graph Neural Networks

Efficient Topology-aware Data Augmentation for High-Degree Graph Neural Networks

Data Augmentation for Graph Data: Recent Advancements

Data Augmentation in Graph Neural Networks: The Role of Generated Synthetic Graphs

A Simple Data Augmentation for Graph Classification: A Perspective of Equivariance and Invariance

Robust Optimization as Data Augmentation for Large-scale Graphs

Data Augmentation on Graphs: A Technical Survey

DAGAD: Data Augmentation for Graph Anomaly Detection

Counterfactual Data Augmentation with Denoising Diffusion for Graph Anomaly Detection

Data Augmentation for Deep Graph Learning: A Survey

Rationalizing Graph Neural Networks with Data Augmentation

Improving Graph Convolutional Network with Learnable Edge Weights and Edge-Node Co-Embedding for Graph Anomaly Detection

Local Augmentation for Graph Neural Networks

Self-Enhanced GNN: Improving Graph Neural Networks Using Model Outputs

On the Power of Graph Neural Networks and Feature Augmentation Strategies to Classify Social Networks

Bag of Tricks for Node Classification with Graph Neural Networks

Graph Neural Networks with Precomputed Node Features

GraphAny: A Foundation Model for Node Classification on Any Graph