Masked AutoEncoder for Graph Clustering without Pre-defined Cluster Number k

Yuanchi Ma,Hui He,Zhongxiang Lei,Zhendong Niu

2024-01-09

Abstract: Graph clustering algorithms with autoencoder structures have recently gained popularity due to their efficient performance and low training cost. However, existing graph autoencoder clustering algorithms based on GCN or GAT not only lack good generalization ability, but also make it difficult to determine the number of clusters automatically. To solve this problem, we propose a new framework called Graph Clustering with Masked Autoencoders (GCMA). It employs our designed fusion autoencoder, based on the graph masking method, for the fusion coding of the graph. It introduces our improved density-based clustering algorithm as a second decoder while decoding with multi-target reconstruction. By decoding the mask embedding, our model can capture more generalized and comprehensive knowledge. The number of clusters and the clustering results can be output end-to-end while improving the generalization ability. Extensive experiments demonstrate the superiority of GCMA, a nonparametric method, over state-of-the-art baselines.
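To make the masking idea concrete, the following is a minimal sketch of masked feature reconstruction, not the GCMA implementation. It assumes that masking is applied to node features, that a zero vector stands in for a learnable mask token, and that reconstruction is scored with mean squared error on the masked nodes only; the abstract does not specify any of these choices, and the function names `mask_node_features` and `masked_mse` are hypothetical.

```python
import numpy as np

def mask_node_features(features, mask_rate, rng):
    """Hide the features of a random subset of nodes.

    Assumption: a zero vector replaces the learnable [MASK] token
    that a trained model would typically use.
    """
    n_nodes = features.shape[0]
    n_masked = max(1, int(round(mask_rate * n_nodes)))
    masked_idx = rng.choice(n_nodes, size=n_masked, replace=False)
    corrupted = features.copy()
    corrupted[masked_idx] = 0.0
    return corrupted, masked_idx

def masked_mse(reconstructed, original, masked_idx):
    """Reconstruction error measured only on the masked nodes."""
    diff = reconstructed[masked_idx] - original[masked_idx]
    return float(np.mean(diff ** 2))

# Toy node-feature matrix: 10 nodes, 2 features each.
X = np.arange(20.0).reshape(10, 2) + 1.0
X_corrupted, idx = mask_node_features(X, mask_rate=0.3, rng=np.random.default_rng(0))

# An encoder/decoder would map X_corrupted (plus the graph) back to features;
# here the corrupted input itself serves as a trivial "reconstruction".
loss = masked_mse(X_corrupted, X, idx)
```

Because the loss only looks at nodes whose features were hidden, a model trained this way has to infer them from the surrounding graph rather than copy its input.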

Machine Learning

What problem does this paper attempt to address?

The paper primarily aims to address the following issues:

1. **Automatically determining the number of clusters**: Existing clustering algorithms based on Graph Autoencoders (GAE) usually require the number of clusters \(k\) to be predefined, yet \(k\) is often unknown in practical applications. The researchers therefore propose a method that determines the number of clusters automatically.
2. **Improving model generalization ability**: Current graph autoencoder clustering algorithms based on Graph Convolutional Networks (GCN) or Graph Attention Networks (GAT) lack good generalization ability, meaning they may not handle unseen data well.
3. **Enhancing the quality of graph embeddings**: Graph autoencoders based on simple graph reconstruction may overemphasize neighborhood information, which is not always beneficial for self-supervised learning. The researchers therefore design better pre-training tasks to improve the quality of the learned graph embeddings.

To address these issues, the researchers propose a new framework named Graph Clustering with Masked Autoencoders (GCMA). The framework combines graph masked autoencoders with an improved density-based clustering algorithm, which enables graph clustering without predefining the number of clusters and improves the model's generalization ability and interpretability. Experimental results show that GCMA outperforms existing baseline methods on multiple datasets.
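As an illustration of why a density-based method needs no predefined \(k\), here is a minimal sketch of plain DBSCAN applied to toy 2-D points standing in for learned node embeddings. It is not the improved density-based algorithm that the paper uses as its second decoder; the function name `density_cluster` and the values of `eps` and `min_pts` are assumptions chosen for this example.

```python
import numpy as np

def density_cluster(points, eps, min_pts):
    """Plain DBSCAN. Returns (labels, n_clusters); label -1 marks noise.

    The number of clusters is an output, not an input.
    """
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_pts for nb in neighbors])

    labels = np.full(n, -1)
    n_clusters = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from an unassigned core point.
        labels[i] = n_clusters
        stack = [i]
        while stack:
            j = stack.pop()
            for nb in neighbors[j]:
                if labels[nb] == -1:
                    labels[nb] = n_clusters
                    if is_core[nb]:  # only core points keep expanding
                        stack.append(nb)
        n_clusters += 1
    return labels, n_clusters

# Toy "embeddings": three tight groups of five points each.
offsets = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]])
centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
embeddings = np.concatenate([c + offsets for c in centers])

labels, n_clusters = density_cluster(embeddings, eps=0.5, min_pts=3)
print(n_clusters)  # 3
```

The cluster count falls out of how the points are connected through dense regions. The trade-off is that `eps` and `min_pts` replace \(k\) as the values that have to be chosen.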

Preserving Global Information for Graph Clustering with Masked Autoencoders

Deep Masked Graph Node Clustering

Embedding Graph Auto-Encoder for Graph Clustering

Towards Faster Deep Graph Clustering via Efficient Graph Auto-Encoder

UGMAE: A Unified Framework for Graph Masked Autoencoders

Masked Graph Modeling with Multi-View Contrast

GFMAE: Self-Supervised GNN-Free Masked Autoencoders

Hi-GMAE: Hierarchical Graph Masked Autoencoders

Adaptive Graph Auto-Encoder for General Data Clustering

Masked Dual Graph Autoencoder for Attributed Graph Community Detection

GRAE - Graph Recurrent Autoencoder for Multi-view Graph Clustering

ProtoMGAE: Prototype-aware Masked Graph Auto-Encoder for Graph Representation Learning

Dynamic Graph Attention-Guided Graph Clustering with Entropy Minimization Self-Supervision

Attributed Graph Clustering Network with Adaptive Feature Fusion

CaEGCN: Cross-Attention Fusion based Enhanced Graph Convolutional Network for Clustering

Adaptive Graph Convolutional Clustering Network with Optimal Probabilistic Graph

RARE: Robust Masked Graph Autoencoder

What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders

Heterogeneous Graph Masked Autoencoders

Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders