Improving Generalizability of Graph Anomaly Detection Models Via Data Augmentation
Shuang Zhou, Xiao Huang, Ninghao Liu, Huachi Zhou, Fu-Lai Chung, Long-Kai Huang
DOI: https://doi.org/10.1109/tkde.2023.3271771
IF: 9.235
2023-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Graph anomaly detection (GAD) has wide applications in real-world networked systems. In many scenarios, people need to identify anomalies on new (sub)graphs but lack the labels to train an effective detection model. Since recent semi-supervised GAD methods, which can leverage the available labels as prior knowledge, have achieved better performance than unsupervised methods, one natural idea is to directly apply a trained semi-supervised GAD model to the new (sub)graphs for testing. However, we find that existing semi-supervised GAD methods generalize poorly: well-trained models do not perform well on an unseen area of the graph, i.e., one not accessible during training. Motivated by this, we formally define the problem of generalized graph anomaly detection, which aims to effectively identify anomalies on both the training-domain graph(s) and the unseen test graph(s). This is a challenging task because only limited labels are available and the normal data distribution may differ between training and testing data. Accordingly, we propose a data augmentation method named AugAN (Augmentation for Anomaly and Normal distributions) to enrich the training data, and we adopt a customized episodic training strategy for learning with the augmented data. Extensive experiments verify the effectiveness of AugAN in improving model generalizability.
computer science, information systems, artificial intelligence, engineering, electrical & electronic