Dissecting the Failure of Invariant Learning on Graphs

Qixun Wang, Yifei Wang, Yisen Wang, Xianghua Ying
2024-11-05
Abstract: Enhancing node-level Out-Of-Distribution (OOD) generalization on graphs remains a crucial area of research. In this paper, we develop a Structural Causal Model (SCM) to theoretically dissect the performance of two prominent invariant learning methods -- Invariant Risk Minimization (IRM) and Variance-Risk Extrapolation (VREx) -- in node-level OOD settings. Our analysis reveals a critical limitation: due to the lack of class-conditional invariance constraints, these methods may struggle to accurately identify the structure of the predictive invariant ego-graph and consequently rely on spurious features. To address this, we propose Cross-environment Intra-class Alignment (CIA), which explicitly eliminates spurious features by aligning cross-environment representations conditioned on the same class, bypassing the need for explicit knowledge of the causal pattern structure. To adapt CIA to node-level OOD scenarios where environment labels are hard to obtain, we further propose CIA-LRA (Localized Reweighting Alignment) that leverages the distribution of neighboring labels to selectively align node representations, effectively distinguishing and preserving invariant features while removing spurious ones, all without relying on environment labels. We theoretically prove CIA-LRA's effectiveness by deriving an OOD generalization error bound based on PAC-Bayesian analysis. Experiments on graph OOD benchmarks validate the superiority of CIA and CIA-LRA, marking a significant advancement in node-level OOD generalization. The code is available at https://github.com/NOVAglow646/NeurIPS24-Invariant-Learning-on-Graphs.
Machine Learning, Artificial Intelligence
### What problem does this paper attempt to address?

This paper addresses node-level out-of-distribution (OOD) generalization on graphs. By constructing a Structural Causal Model (SCM), the authors analyze why two common invariant learning methods, Invariant Risk Minimization (IRM) and Variance-Risk Extrapolation (VREx), perform poorly in this setting.

#### Main problems:

1. **Lack of class-conditional invariance constraints**: Because IRM and VREx impose no class-conditional invariance constraint, they may fail to identify the structure of the predictive invariant ego-graph and end up relying on spurious features.
2. **Difficulty in obtaining environment labels**: In node-level OOD tasks, environment labels are usually hard to obtain, which makes invariant learning methods that depend on environment partitioning impractical.

#### Solutions:

To address these problems, the authors propose two methods:

1. **Cross-environment Intra-class Alignment (CIA)**:
   - **Principle**: CIA explicitly eliminates spurious features by aligning the representations of nodes that share a class but come from different environments. Such nodes share similar causal patterns but differ in their spurious features, so pulling their representations together suppresses the spurious part.
   - **Effect**: CIA can learn invariant representations without explicit knowledge of the causal pattern structure.
2. **Localized Reweighting Alignment (CIA-LRA)**:
   - **Background**: To handle node-level OOD scenarios without environment labels, CIA-LRA uses the distribution of neighboring labels to selectively align node representations, distinguishing and retaining invariant features while removing spurious ones.
   - **Implementation**: CIA-LRA combines localized alignment with reweighted alignment, which avoids the collapse of invariant features that over-alignment would cause.

#### Theoretical and experimental verification:

- **Theoretical proof**: The authors derive an OOD generalization error bound based on PAC-Bayesian analysis to support the effectiveness of CIA-LRA.
- **Experimental results**: Experiments on multiple graph OOD benchmarks show that CIA and CIA-LRA outperform existing methods, with significant improvements on both real-world and synthetic datasets.

### Summary

Starting from an analysis of why existing invariant learning methods fall short on graph data, this paper proposes CIA and its environment-label-free variant CIA-LRA for node-level OOD generalization. The methods are backed by a theoretical generalization bound and perform well on graph OOD benchmarks.
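The CIA principle described above (pull together representations of nodes that share a class but come from different environments) can be illustrated with a small sketch. This is not the authors' implementation: the function name `cia_loss`, the use of mean squared Euclidean distance, and the toy data are all assumptions made for illustration, and environment labels are assumed to be available.

```python
from itertools import combinations


def cia_loss(z, y, env):
    """Illustrative alignment penalty (an assumption, not the paper's exact loss).

    Returns the mean squared Euclidean distance between representations of
    node pairs that have the same class label but different environments.
    z:   list of representation vectors (lists of floats)
    y:   list of class labels
    env: list of environment labels
    """
    # Collect every unordered pair with the same class and different environments.
    pairs = [
        (i, j)
        for i, j in combinations(range(len(y)), 2)
        if y[i] == y[j] and env[i] != env[j]
    ]
    if not pairs:
        return 0.0  # nothing to align
    total = sum(
        sum((a - b) ** 2 for a, b in zip(z[i], z[j])) for i, j in pairs
    )
    return total / len(pairs)


# Hypothetical toy data: four nodes, two classes, two environments.
y = [0, 0, 1, 1]
env = [0, 1, 0, 1]

# An "invariant" feature depends only on the class,
# a "spurious" feature depends only on the environment.
invariant_only = [[1.0], [1.0], [-1.0], [-1.0]]
with_spurious = [[1.0, 0.5], [1.0, -0.5], [-1.0, 0.5], [-1.0, -0.5]]

loss_invariant = cia_loss(invariant_only, y, env)
loss_spurious = cia_loss(with_spurious, y, env)
print(loss_invariant, loss_spurious)
```

In this toy setup the penalty vanishes for the representation that keeps only the class-dependent feature and is positive once the environment-dependent feature is included, which is the behaviour the alignment objective is meant to encourage. CIA-LRA, as summarized above, would additionally restrict and reweight the pairs using neighboring label distributions instead of environment labels; that weighting is not sketched here.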