A Two-Stage Co-Adversarial Perturbation to Mitigate Out-of-distribution Generalization of Large-Scale Graph
Yili Wang, Haotian Xue, Xin Wang
DOI: https://doi.org/10.1016/j.eswa.2024.124472
IF: 8.5
2024-01-01
Expert Systems with Applications
Abstract: Despite recent progress in graph neural networks (GNNs) for modeling graph data, training GNNs on large-scale datasets remains difficult in the graph out-of-distribution (OOD) setting because of pervasive overfitting. To address this issue, researchers have explored adversarial training, a technique that augments the training data with worst-case adversarial examples. However, prior work on adversarial training focuses primarily on defending GNNs against malicious attacks, and its potential to improve the OOD generalization of GNNs remains less explored. In this work, we examine the weight and feature loss landscapes of GNNs, which illustrate how the loss changes with respect to model weights and node features, respectively. Our investigation reveals that GNNs tend to become trapped in sharp local minima of these loss landscapes, resulting in suboptimal OOD generalization. To address this challenge, we introduce co-adversarial perturbation (CAP) optimization, which considers both model weights and node features, and we design an alternating adversarial perturbation algorithm for graph OOD generalization. This algorithm iteratively smooths the weight and feature loss landscapes in alternation. Moreover, training proceeds in two stages. The first stage performs standard cross-entropy minimization to ensure rapid convergence of the GNN models. The second stage applies our alternating adversarial training strategy to keep the models from becoming trapped in sharp local minima. Extensive experiments show that CAP generally improves the OOD generalization performance of GNNs across a diverse range of large-scale graphs.
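The two-stage schedule and the alternation between weight and feature perturbations described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation. It assumes a plain logistic-regression classifier in place of a GNN, a normalized gradient-ascent step as the perturbation rule, and a strict epoch-by-epoch alternation. The names `train`, `perturb`, `warmup`, `rho`, and `eps` are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def loss_and_grads(w, X, y):
    """Mean cross-entropy of a linear classifier, plus its gradients
    with respect to the weights and to each feature vector."""
    n, d = len(X), len(w)
    loss, grad_w, grad_X = 0.0, [0.0] * d, []
    for x, t in zip(X, y):
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        loss -= (t * math.log(p) + (1 - t) * math.log(1.0 - p)) / n
        err = (p - t) / n
        for j in range(d):
            grad_w[j] += err * x[j]
        grad_X.append([err * wj for wj in w])
    return loss, grad_w, grad_X

def perturb(v, g, radius):
    """Move v a distance `radius` along the normalized gradient g
    (an ascent step toward higher loss). Assumed perturbation rule."""
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
    return [vi + radius * gi / norm for vi, gi in zip(v, g)]

def train(X, y, epochs=200, warmup=100, lr=0.5, rho=0.05, eps=0.05):
    w = [0.0] * len(X[0])
    for epoch in range(epochs):
        _, g_w, g_X = loss_and_grads(w, X, y)
        if epoch >= warmup:  # stage 2: alternating adversarial perturbation
            if (epoch - warmup) % 2 == 0:
                # Weight step: take the gradient at perturbed weights.
                w_adv = perturb(w, g_w, rho)
                _, g_w, _ = loss_and_grads(w_adv, X, y)
            else:
                # Feature step: take the gradient on perturbed features.
                X_adv = [perturb(x, gx, eps) for x, gx in zip(X, g_X)]
                _, g_w, _ = loss_and_grads(w, X_adv, y)
        # Stage 1 (epoch < warmup) uses the plain cross-entropy gradient.
        w = [wi - lr * gi for wi, gi in zip(w, g_w)]
    return w

# Toy, linearly separable data (illustrative only).
X = [[1.0, 2.0], [2.0, 1.5], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, 0, 0]
w = train(X, y)
print("clean loss after training:", loss_and_grads(w, X, y)[0])
```

In this sketch the weights are always updated from the unperturbed point, and only the gradient is evaluated at the perturbed weights or features. That is one common way to favor flatter regions of a loss landscape, but whether the paper uses the same update rule, perturbation norms, or alternation schedule is not stated in the abstract.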