Index-free triangle-based graph local clustering

Zhe Yuan,Zhewei Wei,Fangrui Lv,Ji-Rong Wen
DOI: https://doi.org/10.1007/s11704-023-2768-7
IF: 2.6688
2023-12-14
Frontiers of Computer Science
Abstract:Motif-based graph local clustering (MGLC) is a popular method for graph mining tasks due to its various applications. However, the traditional two-phase approach of precomputing motif weights before performing local clustering loses locality and is impractical for large graphs. While some attempts have been made to address the efficiency bottleneck, there is still no applicable algorithm for large scale graphs with billions of edges. In this paper, we propose a purely local and index-free method called Index-free Triangle-based Graph Local Clustering (TGLC*) to solve the MGLC problem w.r.t. a triangle. TGLC* directly estimates the Personalized PageRank (PPR) vector using random walks with the desired triangle-weighted distribution and proposes the clustering result using a standard sweep procedure. We demonstrate TGLC*'s scalability through theoretical analysis and its practical benefits through a novel visualization layout. TGLC* is the first algorithm to solve the MGLC problem without precomputing the motif weight. Extensive experiments on seven real-world large-scale datasets show that TGLC* is applicable and scalable for large graphs.
computer science, information systems, theory & methods, software engineering
What problem does this paper attempt to address?