Out-Of-Distribution Generalization on Graphs: A Survey

Haoyang Li, Xin Wang, Ziwei Zhang, Wenwu Zhu

DOI: https://doi.org/10.48550/arXiv.2202.07987

2022-12-29

Abstract: Graph machine learning has been extensively studied in both academia and industry. Although booming with a vast number of emerging methods and techniques, most of the literature is built on the in-distribution hypothesis, i.e., testing and training graph data are identically distributed. However, this in-distribution hypothesis can hardly be satisfied in many real-world graph scenarios where the model performance substantially degrades when there exist distribution shifts between testing and training graph data. To solve this critical problem, out-of-distribution (OOD) generalization on graphs, which goes beyond the in-distribution hypothesis, has made great progress and attracted ever-increasing attention from the research community. In this paper, we comprehensively survey OOD generalization on graphs and present a detailed review of recent advances in this area. First, we provide a formal problem definition of OOD generalization on graphs. Second, we categorize existing methods into three classes from conceptually different perspectives, i.e., data, model, and learning strategy, based on their positions in the graph machine learning pipeline, followed by detailed discussions for each category. We also review the theories related to OOD generalization on graphs and introduce the commonly used graph datasets for thorough evaluations. Finally, we share our insights on future research directions. This paper is the first systematic and comprehensive review of OOD generalization on graphs, to the best of our knowledge.

Machine Learning

What problem does this paper attempt to address?

This paper addresses out-of-distribution (OOD) generalization on graph data. Most existing graph machine learning methods rely on the in-distribution assumption, i.e., that the training set and the test set follow the same data distribution. In real-world graph scenarios this assumption often fails: distribution shifts between the training data and the test data can cause model performance to degrade substantially. The paper therefore focuses on methods that also generalize well to data drawn from a different distribution, which matters most in high-risk applications such as molecular prediction, financial analysis, criminal justice, autonomous driving, particle physics, and epidemic prediction. By comprehensively reviewing recent progress in OOD generalization on graphs, the paper aims to provide a systematic framework for understanding and addressing this problem.
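
The problem can be sketched formally as follows. The notation is an assumption chosen for illustration and is not quoted from the paper: G denotes an input graph, Y its label, f_θ a model with parameters θ, and ℓ a loss function. Training data are drawn from P_train(G, Y), test data from a different distribution P_test(G, Y), and the goal is to find parameters that perform well under the test distribution:

```latex
\theta^{*} = \arg\min_{\theta} \;
  \mathbb{E}_{(G, Y) \sim P_{\mathrm{test}}}
  \left[ \ell\big( f_{\theta}(G), Y \big) \right],
\qquad \text{where } P_{\mathrm{test}}(G, Y) \neq P_{\mathrm{train}}(G, Y).
```

Since P_test is not observed during training, this objective cannot be minimized directly, which is why the surveyed methods act on the data, the model, or the learning strategy instead.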

Towards Out-Of-Distribution Generalization: A Survey

Beyond Generalization: A Survey of Out-Of-Distribution Adaptation on Graphs

A Survey of Deep Graph Learning under Distribution Shifts: from Graph Out-of-Distribution Generalization to Adaptation

A Survey on Evaluation of Out-of-Distribution Generalization

Graphs Generalization under Distribution Shifts

Graph Structure and Feature Extrapolation for Out-of-Distribution Generalization

DIVE: Subgraph Disagreement for Graph Out-of-Distribution Generalization

Out-of-Distribution Generalization in Natural Language Processing: Past, Present, and Future

Graph Out-of-Distribution Generalization with Controllable Data Augmentation

Out-of-Distribution Generalization in Text Classification: Past, Present, and Future

A Survey of Out-of-distribution Generalization for Graph Machine Learning from a Causal View

OOD-GNN: Out-of-Distribution Generalized Graph Neural Network

Generalized Out-of-Distribution Detection: A Survey

Investigating Out-of-Distribution Generalization of GNNs: An Architecture Perspective

Towards a Theoretical Framework of Out-of-Distribution Generalization

Invariant Graph Learning Meets Information Bottleneck for Out-of-Distribution Generalization

Improving Graph Out-of-distribution Generalization on Real-world Data

Graph Learning under Distribution Shifts: A Comprehensive Survey on Domain Adaptation, Out-of-distribution, and Continual Learning

Graph Out-of-Distribution Generalization via Causal Intervention