Invariant Graph Learning Meets Information Bottleneck for Out-of-Distribution Generalization

Wenyu Mao, Jiancan Wu, Haoyang Liu, Yongduo Sui, Xiang Wang
2024-08-03
Abstract: Graph out-of-distribution (OOD) generalization remains a major challenge in graph learning since graph neural networks (GNNs) often suffer from severe performance degradation under distribution shifts. Invariant learning, aiming to extract invariant features across varied distributions, has recently emerged as a promising approach for OOD generalization. Despite the great success of invariant learning in OOD problems for Euclidean data (e.g., images), the exploration within graph data remains constrained by the complex nature of graphs. Existing studies, such as data augmentation or causal intervention, either suffer from disruptions to invariance during the graph manipulation process or face reliability issues due to a lack of supervised signals for causal parts. In this work, we propose a novel framework, called Invariant Graph Learning based on Information bottleneck theory (InfoIGL), to extract the invariant features of graphs and enhance models' generalization ability to unseen distributions. Specifically, InfoIGL introduces a redundancy filter to compress task-irrelevant information related to environmental factors. Cooperating with our designed multi-level contrastive learning, we maximize the mutual information among graphs of the same class in the downstream classification tasks, preserving invariant features for prediction to a great extent. An appealing feature of InfoIGL is its strong generalization ability without depending on supervised signals of invariance. Experiments on both synthetic and real-world datasets demonstrate that our method achieves state-of-the-art performance under OOD generalization for graph classification tasks. The source code is available at https://github.com/maowenyu-11/InfoIGL.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the out-of-distribution (OOD) generalization problem in graph data. The performance of graph neural networks (GNNs) often drops significantly under distribution shift. To overcome this, the authors propose a new framework, Invariant Graph Learning based on Information Bottleneck Theory (InfoIGL), which extracts invariant features from graph data and improves the model's generalization to unseen distributions.

#### Main problems and challenges

1. **Out-of-distribution generalization problem**:
   - When the test distribution differs from the training distribution, the performance of graph neural networks drops significantly.
   - The problem is especially prominent in practice, because data may be collected under differing environments, which produces distribution shift.
2. **Invariant feature extraction**:
   - Extracting features that remain invariant across different distributions is the key to solving the OOD problem.
   - Invariant features should exclude spurious features tied to environmental factors and remain robust across distributions.
   - At the same time, invariant features must retain enough information to predict labels accurately.
3. **Limitations of existing methods**:
   - **Graph manipulation methods**: These generate diverse data by adding or deleting nodes and edges, but the manipulation itself may break invariance.
   - **Causal disentanglement methods**: These use causal intervention to extract causal subgraphs, but without supervision signals for the causal parts, the separation of causal from non-causal parts is unreliable.

### Solutions

To address these problems, the authors propose the InfoIGL framework. Its main contributions are as follows:

1. **Compress redundant information**:
   - A redundancy filter minimizes task-irrelevant information, reducing the influence of redundant features.
   - An attention mechanism assigns invariance scores to nodes and edges so that spurious features can be removed.
2. **Maximize mutual information**:
   - Multi-level contrastive learning maximizes the mutual information between graphs of the same class at both the semantic and the instance level.
   - The encoder is optimized through contrastive loss functions so that the model captures invariant features.
3. **No need for supervision signals**:
   - InfoIGL does not rely on supervision signals of invariance, which makes the framework more broadly applicable.
4. **Experimental verification**:
   - Experiments on synthetic and real-world datasets show that InfoIGL achieves state-of-the-art OOD generalization performance on graph classification tasks.

Through these contributions, InfoIGL addresses the out-of-distribution generalization problem in graph data and offers a practical tool for graph learning.
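
The two mechanisms summarized above, attention-based invariance scoring and contrastive learning among same-class graphs, can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes a fixed query vector for the attention scores, uses a generic supervised contrastive loss as a stand-in for the instance-level term, and omits the semantic-level term and edge scores entirely. All function and parameter names (`invariance_scores`, `filtered_readout`, `supervised_contrastive_loss`, `temperature`) are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def invariance_scores(node_feats, query):
    """Softmax attention over nodes.

    Assumption: a single query vector scores each node; a higher score
    means the node is treated as more likely to carry invariant signal.
    """
    logits = [dot(h, query) for h in node_feats]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def filtered_readout(node_feats, scores):
    """Score-weighted sum of node features, giving one graph embedding."""
    dim = len(node_feats[0])
    return [sum(s * h[d] for s, h in zip(scores, node_feats)) for d in range(dim)]


def supervised_contrastive_loss(embeddings, labels, temperature=0.5):
    """Generic supervised contrastive loss over graph embeddings.

    Graphs sharing a label are positives; all other graphs in the batch
    appear in the denominator. Anchors without any positive are skipped.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        denom = sum(math.exp(dot(z[i], z[k]) / temperature)
                    for k in range(n) if k != i)
        log_probs = [dot(z[i], z[j]) / temperature - math.log(denom)
                     for j in positives]
        total += -sum(log_probs) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy usage: score three nodes, pool them, then compare a batch of embeddings.
nodes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
scores = invariance_scores(nodes, query=[1.0, 0.0])
graph_embedding = filtered_readout(nodes, scores)

batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss = supervised_contrastive_loss(batch, labels=[0, 0, 1, 1])
```

Minimizing a loss of this kind pulls embeddings of same-class graphs together, which is one common way to raise a lower bound on their mutual information. In a real model the query vector and encoder would be learned jointly rather than fixed as here.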