Self-Supervised Graph Embedding Clustering

Fangfang Li, Quanxue Gao, Cheng Deng, Wei Xia

2024-10-30

Abstract: The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. However, it combines the K-means clustering and dimensionality reduction processes for optimization, leading to limitations in the clustering effect due to the introduced hyperparameters and the initialization of clustering centers. Moreover, maintaining class balance during clustering remains challenging. To overcome these issues, we propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework. Specifically, we establish a connection between K-means and the manifold structure, allowing us to perform K-means without explicitly defining centroids. Additionally, we use this centroid-free K-means to generate labels in low-dimensional space and subsequently utilize the label information to determine the similarity between samples. This approach ensures consistency between the manifold structure and the labels. Our model effectively achieves one-step clustering without the need for redundant balancing hyperparameters. Notably, we have discovered that maximizing the $\ell_{2,1}$-norm naturally maintains class balance during clustering, a result that we have theoretically proven. Finally, experiments on multiple datasets demonstrate that the clustering results of Our-LPP and Our-MFA exhibit excellent and reliable performance.

Machine Learning

What problem does this paper attempt to address?

The paper addresses the curse of dimensionality in clustering high-dimensional data. When traditional clustering algorithms process such data, samples become sparse in the feature space and pairwise distances become less discriminative, making similarities hard to identify. Noise and redundant information in high-dimensional data further complicate clustering and reduce the accuracy of the results. To address these problems, the paper proposes a unified framework that combines manifold learning with K-means clustering, forming a self-supervised graph embedding framework. The main features of this framework are:

1. **K-means without centroids**: By establishing a connection between K-means and the manifold structure, K-means clustering can be performed without explicitly defining the cluster centers.
2. **Label generation in low-dimensional space**: The centroid-free K-means generates labels in the low-dimensional space, and this label information is used to determine the similarity between samples, keeping the manifold structure consistent with the labels.
3. **Natural class balance**: Maximizing the $\ell_{2,1}$-norm naturally maintains class balance during clustering; the paper supports this with a theoretical proof.

Experiments on multiple datasets are reported to show that the proposed Our-LPP and Our-MFA methods deliver excellent and reliable clustering performance.
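Two of the features above can be illustrated numerically. The sketch below is not the paper's algorithm; it only checks two standard facts that make these ideas plausible. First, the K-means objective can be computed from pairwise distances within each cluster, with no centroid appearing anywhere. Second, for a one-hot indicator matrix, the sum of column $\ell_2$-norms equals $\sum_k \sqrt{n_k}$, which is largest when the cluster sizes $n_k$ are equal. The function names, the toy data, and the assumption that the $\ell_{2,1}$-norm is taken over the columns of the indicator matrix are all illustrative choices, since the summary does not give the paper's exact formulation.

```python
import numpy as np


def cost_with_centroids(X, labels):
    """Classic K-means objective: squared distance of each point to its cluster mean."""
    total = 0.0
    for k in np.unique(labels):
        pts = X[labels == k]
        total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total


def cost_centroid_free(X, labels):
    """Same objective from pairwise distances only:
    sum_k (1 / (2 * n_k)) * sum_{i, j in C_k} ||x_i - x_j||^2.
    """
    sq_norms = (X ** 2).sum(axis=1)
    # Matrix of squared Euclidean distances between all pairs of samples.
    D = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    total = 0.0
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        total += D[np.ix_(idx, idx)].sum() / (2.0 * idx.size)
    return total


def column_norm_sum(labels, n_clusters):
    """Sum of column l2-norms of the one-hot indicator matrix (assumed form
    of the l2,1-norm here); equals sum_k sqrt(n_k)."""
    Y = np.eye(n_clusters)[labels]  # one-hot rows, shape (n_samples, n_clusters)
    return np.linalg.norm(Y, axis=0).sum()


rng = np.random.default_rng(0)
X = rng.normal(size=(12, 3))  # toy data, purely illustrative

balanced = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
skewed = np.array([0] * 10 + [1, 2])

# The two formulations of the K-means cost agree up to rounding error.
print(np.isclose(cost_with_centroids(X, balanced), cost_centroid_free(X, balanced)))

# The balanced partition has the larger column-norm sum.
print(column_norm_sum(balanced, 3), column_norm_sum(skewed, 3))
```

Because $\sqrt{\cdot}$ is concave, moving samples from a large cluster to a small one always increases $\sum_k \sqrt{n_k}$, which is one intuitive reading of why maximizing such a norm would favor balanced clusters.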

Unsupervised Dimensionality Reduction Based on Fusing Multiple Clustering Results

Self-Supervised Clustering based on Manifold Learning and Graph Convolutional Networks

Discriminative Unsupervised Dimensionality Reduction

Fast Self-Supervised Clustering With Anchor Graph

Unsupervised Multi-Manifold Clustering by Learning Deep Representation

Unsupervised Large Graph Embedding Based on Balanced and Hierarchical K-Means

Structured Optimal Graph-Based Clustering with Flexible Embedding

Discriminative Unsupervised 2D Dimensionality Reduction with Graph Embedding

Locality Sensitive Discriminative Unsupervised Dimensionality Reduction

Unsupervised Dimension Reduction Using Supervised Orthogonal Discriminant Projection for Clustering

Unsupervised Deep Embedding for Fuzzy Clustering

Graph Embedding Clustering: Graph Attention Auto-Encoder with Cluster-Specificity Distribution

Unsupervised Manifold Linearizing and Clustering

Self-supervised graph clustering via attention auto-encoder with distribution specificity

Adaptive Flexible Optimal Graph for Unsupervised Dimensionality Reduction

Unsupervised Feature Extraction Using a Learned Graph with Clustering Structure

Multi-view Projected Clustering with Graph Learning

Semi-Supervised Clustering via Dynamic Graph Structure Learning

Deep Self-Supervised Attributed Graph Clustering for Social Network Analysis

Self-supervised Graph Convolutional Clustering by Preserving Latent Distribution