Abstract: Graph clustering, which aims to partition nodes into several disjoint subsets, is a fundamental task in graph-structured learning. Traditional graph clustering methods consider only the adjacency information. In recent years, inspired by the homophily assumption that adjacent nodes tend to have similar features and labels, most graph clustering approaches have leveraged node attribute information to improve clustering performance. These works have mainly focused on node embedding learning via various combinations of auto-encoders and graph neural networks. For cluster learning, they introduce a self-optimizing strategy that assumes all clusters are homogeneous. However, this assumption usually does not hold, since the size and variance of different clusters can differ substantially, and the self-optimizing strategy is ill-suited to such heterogeneous clusters. In this work, we propose a novel method named Adaptive Harmony Learning and Optimization (AHLO) for attributed graph clustering, which models the node embeddings with a mixture of von Mises-Fisher distributions on the unit hypersphere and develops an alternating learning strategy. Specifically, we take the node embeddings as the supervisory signals for updating the mixture parameters, and the mixture distribution as the supervisory signal for updating the node embeddings. To prevent small clusters from being annexed by large clusters, we develop a regularized harmony loss to enhance prediction on small clusters. In the mixture parameter optimization stage, we use the EM algorithm and heuristically design a center update scheme that accounts for the confidence of the posterior probabilities and the influence of other centers. Hence, AHLO can simultaneously improve intra-cluster compactness and inter-cluster separability. Extensive experiments on four benchmark attributed graph datasets demonstrate the effectiveness of the proposed AHLO.
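To make the mixture-modeling step more concrete, the following is a minimal sketch of textbook EM for a mixture of von Mises-Fisher distributions on the unit hypersphere. It is not the AHLO procedure: it omits the regularized harmony loss and the heuristic center update scheme, and it assumes a single fixed concentration `kappa` shared by all components, so the vMF normalizing constant cancels in the E-step. The function names, the default values, and the random initialization are illustrative assumptions.

```python
import numpy as np


def normalize_rows(x):
    """Project each row onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def vmf_mixture_em(z, k, kappa=20.0, n_iter=50, seed=0):
    """Plain EM for a vMF mixture with a shared, fixed concentration.

    Assumption (not from the abstract): kappa is fixed and identical for
    every component, so only mean directions and mixing weights are learned.
    """
    rng = np.random.default_rng(seed)
    z = normalize_rows(np.asarray(z, dtype=float))
    n = z.shape[0]

    # Initialize mean directions from k distinct embeddings, uniform weights.
    mu = z[rng.choice(n, size=k, replace=False)].copy()
    pi = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each node.
        # log p(z_i | c) = kappa * <mu_c, z_i> + const, with const shared.
        logits = kappa * (z @ mu.T) + np.log(pi)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: mixing weights and re-normalized weighted mean directions.
        pi = np.clip(resp.mean(axis=0), 1e-12, None)
        pi /= pi.sum()
        mu = normalize_rows(resp.T @ z + 1e-12)  # offset guards empty clusters

    # resp holds the responsibilities computed in the last E-step.
    return mu, pi, resp
```

Calling `vmf_mixture_em(embeddings, k=7)` on an `(n, d)` array of node embeddings returns the mean directions, the mixing weights, and the soft assignments; `resp.argmax(axis=1)` then gives hard cluster labels. Because the mixing weights are learned, this baseline tolerates some difference in cluster sizes, but the shared `kappa` still forces every cluster to have the same spread, which is the kind of homogeneity the abstract argues against.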

Clustering validation by distribution hypothesis learning

Distribution free optimality intervals for clustering

Clustering-Based Validation Splits for Model Selection under Domain Shift

Adaptive Graph Fusion Learning for Multi-View Spectral Clustering

Clustering Validation with The Area Under Precision-Recall Curves

Flexible Clustering with a Sparse Mixture of Generalized Hyperbolic Distributions

Learning to Link

Adaptive Harmony Learning and Optimization for Attributed Graph Clustering

Word Clustering with Validity Indices

Extension of the Dip-test Repertoire -- Efficient and Differentiable p-value Calculation for Clustering

Interpretable Clustering with the Distinguishability Criterion

Learning Uniform Clusters on Hypersphere for Deep Graph-level Clustering

CLUSTERING AND CLUSTER VALIDATION IN DATA MINING

Linking Robustness and Generalization: A k* Distribution Analysis of Concept Clustering in Latent Space for Vision Models

A provable initialization and robust clustering method for general mixture models

Clustering with Confidence: Finding Clusters with Statistical Guarantees

Towards understanding hierarchical clustering: A data distribution perspective

Supervised Hierarchical Clustering with Exponential Linkage

Improved Hierarchical Clustering on Massive Datasets with Broad Guarantees

Assessing Method For E-Learner Clustering

Learning to Generate Fair Clusters from Demonstrations