Abstract: Attributed networks containing entity-specific information in node attributes are ubiquitous in modeling social networks, e-commerce, bioinformatics, etc. Their inherent network topology ranges from simple graphs to hypergraphs with high-order interactions and multiplex graphs with separate layers. An important graph mining task is node clustering, aiming to partition the nodes of an attributed network into k disjoint clusters such that intra-cluster nodes are closely connected and share similar attributes, while inter-cluster nodes are far apart and dissimilar. It is highly challenging to capture multi-hop connections via nodes or attributes for effective clustering on multiple types of attributed networks. In this paper, we first present AHCKA as an efficient approach to attributed hypergraph clustering (AHC). AHCKA includes a carefully-crafted K-nearest neighbor augmentation strategy for the optimized exploitation of attribute information on hypergraphs, a joint hypergraph random walk model to devise an effective AHC objective, and an efficient solver with speedup techniques for the objective optimization. The proposed techniques are extensible to various types of attributed networks, and thus, we develop ANCKA as a versatile attributed network clustering framework, capable of attributed graph clustering (AGC), attributed multiplex graph clustering (AMGC), and AHC. Moreover, we devise ANCKA with algorithmic designs tailored for GPU acceleration to boost efficiency. We have conducted extensive experiments to compare our methods with 19 competitors on 8 attributed hypergraphs, 16 competitors on 6 attributed graphs, and 16 competitors on 3 attributed multiplex graphs, all demonstrating the superb clustering quality and efficiency of our methods.

What problem does this paper attempt to address?

The paper addresses efficient, high-quality node clustering in attributed networks, which arise in social networks, e-commerce, bioinformatics, and similar domains. Specifically, it targets three types of attributed networks (attributed graphs, attributed hypergraphs, and attributed multiplex graphs) and seeks clusters whose nodes are tightly connected in the network topology and share similar attributes, while nodes in different clusters are far apart and dissimilar in attributes.

### Main Challenges

1. **Capturing Multi-hop Relationships**: Multi-hop connections, whether through nodes or through attributes, are hard to capture effectively, especially across multiple types of attributed networks.
2. **Integration of Heterogeneous Information**: Nodes, hyperedge connections, and attributes are heterogeneous objects, so their information cannot be combined in a simple, direct way.
3. **Computational Complexity**: On large attributed hypergraphs, existing methods either produce unsatisfactory results or are too computationally expensive to be practical.

### Solution

The paper first presents AHCKA for attributed hypergraph clustering (AHC) and then generalizes it into ANCKA, a framework that handles AHC, attributed graph clustering (AGC), and attributed multiplex graph clustering (AMGC). The main innovations are:

1. **K-Nearest Neighbor Augmentation Strategy**: The original hypergraph structure is augmented with a K-nearest neighbor graph that adds connections between nodes with similar attributes, so that attribute information contributes directly to clustering quality.
2. **Joint Random Walk Model**: The optimization objective is based on a joint random walk model that combines the higher-order relationships in the hypergraph with those in the K-nearest neighbor graph, which helps capture multi-hop relationships between nodes.
3. **Efficient Optimization Techniques**: Through theoretical analysis, the original NP-hard problem is transformed into an approximate matrix trace optimization problem, and high-quality solutions are searched for iteratively using efficient matrix operations.
4. **GPU Acceleration**: ANCKA-GPU uses algorithmic designs tailored for GPUs to further improve efficiency, especially on large-scale datasets.

### Experimental Validation

The paper evaluates clustering quality and efficiency against 19 competitors on 8 attributed hypergraphs, 16 competitors on 6 attributed graphs, and 16 competitors on 3 attributed multiplex graphs. According to the reported results, ANCKA outperforms existing methods on these benchmark datasets, and ANCKA-GPU reduces computation time further, especially on large-scale datasets.

### Conclusion

The paper proposes ANCKA, a general and efficient attributed network clustering framework that handles attributed hypergraphs, attributed graphs, and attributed multiplex graphs. By combining the K-nearest neighbor augmentation strategy with the joint random walk model, ANCKA achieves efficient computation while maintaining high clustering quality, and ANCKA-GPU extends this to larger datasets through GPU acceleration.
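
To make the K-nearest neighbor strategy and the joint random walk idea summarized above more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the use of cosine similarity, the one-step mixture with weight `beta`, and all function names (`knn_graph`, `hypergraph_transition`, `joint_transition`) are hypothetical choices made for this example, and the actual ANCKA model and its parameters may differ.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 adjacency linking each node to its k most cosine-similar nodes.
    Cosine similarity is an assumption made for this sketch."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)        # unit-normalize attribute rows
    S = Xn @ Xn.T                            # pairwise cosine similarities
    np.fill_diagonal(S, -np.inf)             # a node is never its own neighbor
    nbrs = np.argsort(-S, axis=1)[:, :k]     # indices of the top-k neighbors per node
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)                # symmetrize

def row_normalize(M):
    """Divide each row by its sum; all-zero rows stay zero."""
    M = np.asarray(M, dtype=float)
    s = M.sum(axis=1, keepdims=True)
    return np.divide(M, s, out=np.zeros_like(M), where=s > 0)

def hypergraph_transition(H):
    """One step of a simple hypergraph walk on an n x m incidence matrix H:
    node -> uniformly chosen incident hyperedge -> uniformly chosen member node."""
    return row_normalize(H) @ row_normalize(np.asarray(H).T)

def joint_transition(H, X, k, beta):
    """Mix the topology walk and the attribute (KNN) walk.
    With probability beta the walker follows a KNN edge, otherwise a hyperedge."""
    P_topo = hypergraph_transition(H)
    P_attr = row_normalize(knn_graph(X, k))
    return (1.0 - beta) * P_topo + beta * P_attr

# Toy input: 4 nodes, hyperedges {0, 1} and {2, 3}, 2-dimensional attributes.
H = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
X = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
P = joint_transition(H, X, k=1, beta=0.5)
```

In this toy input every node lies in a hyperedge and has at least one KNN neighbor, so each row of `P` is a probability distribution. Multiplying `P` by itself gives multi-hop transition probabilities, which is one way to see how a joint walk can relate nodes through both hyperedges and attribute similarity.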

A Versatile Framework for Attributed Network Clustering via K-Nearest Neighbor Augmentation

CoHomo: A Cluster-Attribute Correlation Aware Graph Clustering Framework

Attributed Graph Clustering Network with Adaptive Feature Fusion

Attributed Multiplex Graph Clustering: A Heuristic Clustering-Aware Network Embedding Approach

Adaptive Harmony Learning and Optimization for Attributed Graph Clustering

Adaptive Graph Convolution Using Heat Kernel for Attributed Graph Clustering

Towards attributed graph clustering using enhanced graph and reconstructed graph structure

Adaptive Graph Convolution Methods for Attributed Graph Clustering

Effective Clustering on Large Attributed Bipartite Graphs

Attribute-Missing Graph Clustering Network

Clustering in Networks with Multi-Modality Attributes

Clustering Large Attributed Information Networks: an Efficient Incremental Computing Approach

Attributed Graph Clustering via Adaptive Graph Convolution

Scalable and Adaptive Spectral Embedding for Attributed Graph Clustering

Cross Multi-Type Objects Clustering in Attributed Heterogeneous Information Network

A novel method of spectral clustering in attributed networks by constructing parameter-free affinity matrix

Attributed Graph Clustering Algorithm Based on Cluster-aware Multiagent System

GRACE: A General Graph Convolution Framework for Attributed Graph Clustering

A Novel Approach to Attributed Graph Overlapping Clustering

Auxiliary Graph for Attribute Graph Clustering

Detecting Anomalies in Attributed Networks Through Sparse Canonical Correlation Analysis Combined With Random Masking and Padding