Abstract: Attributed graph clustering, which learns node representations from node attributes and the graph topology for clustering, is a fundamental but challenging task in graph analysis. Recently, methods based on graph contrastive learning (GCL) have obtained impressive clustering performance on this task. Yet, we observe that existing GCL-based methods 1) fail to benefit from imprecise clustering labels; 2) require a post-processing operation to get clustering labels; 3) cannot solve the out-of-sample (OOS) problem. To address these issues, we propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC). In SCAGC, by leveraging inaccurate clustering labels, a self-supervised contrastive loss, which aims to maximize the similarities of intra-cluster nodes while minimizing the similarities of inter-cluster nodes, is designed for node representation learning. Meanwhile, a clustering module is built to directly output clustering labels by contrasting the representations of different clusters. Thus, for OOS nodes, SCAGC can directly calculate their clustering labels. Extensive experimental results on four benchmark datasets have shown that SCAGC consistently outperforms 11 competitive clustering methods.

What problem does this paper attempt to address?

The paper targets three limitations of existing graph contrastive learning (GCL)-based methods for attributed graph clustering:

1. **Unable to benefit from imprecise clustering labels**: Existing GCL methods do not exploit inaccurate clustering labels during representation learning, which limits their clustering performance.
2. **Require post-processing operations to obtain clustering labels**: These methods usually need an additional step to generate the final clustering labels, which may lead to sub-optimal node representations.
3. **Unable to solve the out-of-sample (OOS) problem**: Existing methods cannot directly handle unseen nodes, which limits their use in practical engineering applications.

To solve these problems, the authors propose a new network, Self-supervised Contrastive Attributed Graph Clustering (SCAGC). The main improvements of SCAGC include:

- **Utilizing imprecise clustering labels**: A self-supervised contrastive loss maximizes the similarity between nodes within the same cluster and minimizes the similarity between nodes in different clusters.
- **Directly outputting clustering labels**: A clustering module directly outputs clustering labels by contrasting the representations of different clusters.
- **Handling out-of-sample nodes**: For out-of-sample nodes, SCAGC can directly calculate their clustering labels without retraining on the entire graph.

### Formula Summary

1. **Node Representation Learning Module**:

\[ Z^{(v)} = P(X^{(v)}, G^{(v)}|\Omega_1)=\sigma\left((\tilde{D}^{(v)})^{-\frac{1}{2}}\tilde{G}^{(v)}(\tilde{D}^{(v)})^{-\frac{1}{2}}X^{(v)}\Omega_1\right) \]

\[ Z^{(v)} = P(Z^{(v)}, G^{(v)}|\Omega_2) \]

2. **Self-supervised Contrastive Loss**:

\[ L_i = -\frac{1}{|\Delta_i|}\sum_{t\in\Delta_i}\sum_{\alpha,\beta = 1}^{2}\log\frac{\exp\left(\mathcal{S}(m_i^{(\alpha)},m_t^{(\beta)})/\tau_2\right)}{\sum_{\alpha',\beta' = 1}^{2}\sum_{q\in\nabla_i}\exp\left(\mathcal{S}(m_i^{(\alpha')},m_q^{(\beta')})/\tau_2\right)} \]

\[ L_{SGC}=\min_{\Omega,\phi}\sum_{i = 1}^{N} L_i \]

3. **Contrastive Clustering Loss**:

\[ L(\hat{\ell}_k^{(1)},\hat{\ell}_k^{(2)})=-\log\frac{\exp\left(\mathcal{S}(\hat{\ell}_k^{(1)},\hat{\ell}_k^{(2)})/\tau_1\right)}{\sum_{j = 1}^{K} \exp\left(\mathcal{S}(\hat{\ell}_k^{(1)},\hat{\ell}_j^{(1)})/\tau_1\right)+\sum_{j = 1}^{K} \exp\left(\mathcal{S}(\hat{\ell}_k^{(1)},\hat{\ell}_j^{(2)})/\tau_1\right)} \]

\[ L_{CC}=\min_{\Omega,\psi}\frac{1}{2K}\sum_{k = 1}^K[L(\hat{\ell}_k^{(1)},\hat{\ell}_k^{(2)})+L(\hat{\ell}
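To make the self-supervised contrastive loss \(L_i\) summarized above more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: it assumes \(\mathcal{S}\) is cosine similarity, that \(\Delta_i\) is the set of other nodes sharing node \(i\)'s (possibly inaccurate) pseudo-label, and that \(\nabla_i\) is the set of nodes with a different pseudo-label. The function names, the default `tau=0.5`, and the choice to skip nodes with no positives or no negatives are all hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a and rows of b (rows assumed non-zero)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def self_supervised_contrastive_loss(m1, m2, pseudo_labels, tau=0.5):
    """Sum of per-node losses L_i over two views m1, m2 of shape (N, d).

    Assumptions (not confirmed by the source): S is cosine similarity,
    positives are other nodes with the same pseudo-label, negatives are
    nodes with a different pseudo-label.
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    labels = np.asarray(pseudo_labels)
    n = labels.shape[0]

    same = labels[:, None] == labels[None, :]
    pos = same & ~np.eye(n, dtype=bool)  # Delta_i: same cluster, excluding i itself
    neg = ~same                          # nabla_i: different cluster

    # Scaled similarities S(.,.)/tau for the four view pairs (alpha, beta).
    views = (m1, m2)
    sims = [cosine_similarity_matrix(va, vb) / tau for va in views for vb in views]

    # Denominator: sum over all view pairs and all negatives q of node i.
    denom = sum((np.exp(s) * neg).sum(axis=1) for s in sims)
    pos_count = pos.sum(axis=1)

    # Nodes without positives or without negatives contribute zero here.
    valid = (pos_count > 0) & (denom > 0)
    log_denom = np.log(np.where(valid, denom, 1.0))

    # log(exp(s) / denom) = s - log(denom), summed over the four view pairs.
    log_ratio = sum(s - log_denom[:, None] for s in sims)

    per_node = np.zeros(n)
    per_node[valid] = -(log_ratio * pos).sum(axis=1)[valid] / pos_count[valid]
    return float(per_node.sum())
```

With embeddings that form two tight groups, pseudo-labels that match the groups should give a lower loss than labels that cut across them, which is the behavior the loss is meant to reward. Because the denominator in the formula sums only over negatives, the value can be negative; only relative comparisons are meaningful in this sketch.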

Self-supervised Contrastive Attributed Graph Clustering

Graph Representation Learning via Contrasting Cluster Assignments

Simple Contrastive Graph Clustering

Graph Clustering with High-Order Contrastive Learning

Self-Supervised Contrastive Graph Clustering Network via Structural Information Fusion

Cluster-guided Contrastive Graph Clustering Network

CC-GNN: A Clustering Contrastive Learning Network for Graph Semi-Supervised Learning

Supervised contrastive learning for graph representation enhancement

Dual Contrastive Learning Network for Graph Clustering

ClusterSCL: Cluster-Aware Supervised Contrastive Learning on Graphs

Deep Self-Supervised Attributed Graph Clustering for Social Network Analysis

Reliable Node Similarity Matrix Guided Contrastive Graph Clustering

Unbiased and augmentation-free self-supervised graph representation learning

Self-Supervised Conditional Distribution Learning on Graphs

Enhancing Graph Contrastive Learning with Node Similarity

Graph Soft-Contrastive Learning via Neighborhood Ranking

Graph Self-Contrast Representation Learning

Deep Contrastive Graph Learning with Clustering-Oriented Guidance

Eliciting Structural and Semantic Global Knowledge in Unsupervised Graph Contrastive Learning

Subgraph Networks Based Contrastive Learning

Graph Contrastive Learning with Adaptive Augmentation