SPORT: A Subgraph Perspective on Graph Classification with Label Noise

Nan Yin, Li Shen, Chong Chen, Xian-Sheng Hua, Xiao Luo
DOI: https://doi.org/10.1145/3687468
IF: 4.157
2024-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: Graph neural networks (GNNs) have recently achieved great success on graph classification tasks using supervised end-to-end training. Unfortunately, real-world graph labels can be extensively noisy because manual annotation of graph data is a complicated process, and this noise may significantly degrade the performance of GNNs. Therefore, we investigate the problem of graph classification with label noise, which is challenging because of the complexity of graph representation learning and the serious memorization of noisy samples. In this work, we present a novel approach called Subgraph Set Network with Sample Selection and Consistency Learning (SPORT) for this problem. To alleviate the overfitting of GNNs, SPORT characterizes each graph as a set of subgraphs generated by certain predefined strategies, which can be viewed as samples from its underlying semantic distribution in graph space. We then develop an equivariant network that encodes the subgraph set while taking the symmetry group into account. To further reduce the influence of noisy examples, we leverage the predictions of subgraphs to measure the likelihood of a sample being clean or noisy, followed by effective label updating. In addition, we propose a joint loss that improves model generalizability by introducing consistency regularization. Comprehensive experiments on a wide range of graph classification datasets demonstrate the effectiveness of SPORT. Specifically, SPORT outperforms the most competitive baseline by up to 6.4%.
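The sample-selection and label-updating step can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the scoring rule, so the sketch assumes the clean likelihood is the mean probability that the subgraph predictions assign to the given label, and that a sample below a fixed threshold is relabeled to the class with the highest mean probability. The function names `clean_likelihood` and `update_label` and the `threshold` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def clean_likelihood(subgraph_probs: Sequence[Sequence[float]], label: int) -> float:
    """Assumed score: mean probability the subgraph predictions give to `label`."""
    return sum(p[label] for p in subgraph_probs) / len(subgraph_probs)


def update_label(
    subgraph_probs: Sequence[Sequence[float]],
    label: int,
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Tuple[int, bool]:
    """Return (possibly updated label, whether the sample was judged clean)."""
    if clean_likelihood(subgraph_probs, label) >= threshold:
        return label, True

    # Judged noisy: relabel to the class with the highest mean subgraph probability.
    num_classes = len(subgraph_probs[0])
    mean_probs: List[float] = [
        sum(p[c] for p in subgraph_probs) / len(subgraph_probs)
        for c in range(num_classes)
    ]
    new_label = max(range(num_classes), key=mean_probs.__getitem__)
    return new_label, False


# Three subgraphs of one graph, each with a two-class prediction.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
kept = update_label(probs, label=0)     # subgraphs agree with the given label
changed = update_label(probs, label=1)  # subgraphs disagree with the given label
```

Averaging over subgraphs is only one possible aggregation; a vote count or a per-sample loss statistic would fit the same description in the abstract.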