Node Clustering on Attributed Graph Using Anchor Sampling Strategy and Debiasing Strategy

Qian Tang, Yiji Zhao, Hao Wu, Lei Zhang
DOI: https://doi.org/10.1109/tetci.2024.3369849
2024-01-01
IEEE Transactions on Emerging Topics in Computational Intelligence
Abstract: Contrastive representation learning has been widely employed in attributed graph clustering and has demonstrated significant success. However, existing methods have two problems: 1) although clusters are assumed to form around a minority of central anchor nodes, the contrastive relationships between these anchors have not been explored in previous works; 2) they fail to deal with biased sample pairs, which may degrade representation quality and cause poor clustering performance. To solve these problems, we propose a framework termed GE-S-D for both node representation learning and clustering, which consists of an anchor sampling strategy, a low-pass graph encoder, and a debiasing strategy. Specifically, to reveal the contrastive relationships between anchors, we design a sampling strategy that selects a small number of anchors and then constructs a training set of positive and negative sample pairs for contrastive learning. Then, we introduce a low-pass graph encoder to propagate contrastive messages to all nodes and learn cluster-friendly node representations. Furthermore, to alleviate the interference of biased sample pairs, we design a debiasing strategy that applies K-Means to the node representations to obtain clustering information and removes the false positive and false negative sample pairs from the training set to improve contrastive learning. The clustering performance is verified on five benchmark datasets, and our method is superior to many state-of-the-art methods according to quantitative and qualitative analysis.
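The debiasing strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact filtering rule, so the sketch assumes that a positive pair is treated as false when K-Means places its two nodes in different clusters, and a negative pair is treated as false when K-Means places them in the same cluster. The function names `kmeans` and `debias_pairs`, the plain Lloyd's K-Means, and the toy embeddings are all hypothetical.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-Means on a list of equal-length float vectors.

    Returns one cluster label per point. Illustrative only; a real
    pipeline would likely use a library implementation.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def debias_pairs(embeddings, pos_pairs, neg_pairs, k, seed=0):
    """Drop pairs that disagree with the K-Means clustering (assumed rule)."""
    labels = kmeans(embeddings, k, seed=seed)
    # Assumed rule: a true positive pair shares a cluster.
    kept_pos = [(i, j) for i, j in pos_pairs if labels[i] == labels[j]]
    # Assumed rule: a true negative pair spans two clusters.
    kept_neg = [(i, j) for i, j in neg_pairs if labels[i] != labels[j]]
    return kept_pos, kept_neg


# Toy node representations: nodes 0-2 form one group, nodes 3-5 another.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
pos_pairs = [(0, 1), (0, 3)]  # (0, 3) crosses groups: a false positive
neg_pairs = [(1, 4), (3, 4)]  # (3, 4) is within a group: a false negative
kept_pos, kept_neg = debias_pairs(embeddings, pos_pairs, neg_pairs, k=2)
```

In this toy setup the cross-group positive pair and the within-group negative pair are filtered out, leaving a cleaner training set for the contrastive objective. In the framework described above, the embeddings would come from the low-pass graph encoder and the pairs from the anchor sampling strategy.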