Improved Graph Structure Clustering Algorithm by Using Parallel Strategy
Yazhong CHEN, Zhenjun LI, Ronghua LI, Rui MAO, Shaojie QIAO
DOI: https://doi.org/10.3778/j.issn.1002-8331.1710-0230
2019-01-01
Abstract: Graph clustering has recently attracted much attention in the research community. A number of clustering methods, such as modularity optimization, spectral clustering, and density-based clustering, have proved useful for graph data. Among them, SCAN is a well-known density-based algorithm for graph data. SCAN not only finds clusters but also identifies hub nodes and outliers. The SCAN algorithm, however, has two limitations. First, computing the structural similarity for each edge in a massive graph is very costly. Second, SCAN is sensitive to its parameters ε and μ. To overcome these two limitations, this paper proposes an OpenMP-based parallel algorithm with two carefully designed load-balancing techniques to compute the similarities efficiently, as well as a novel triangle-based graph structural clustering algorithm called TSCAN. The striking feature of TSCAN is that it is not sensitive to its parameters, and it is also capable of finding overlapping and dense communities. Finally, extensive experiments over several real-world datasets evaluate the proposed algorithms. The results indicate that the parallel implementation achieves near-linear speedup, and that TSCAN is robust to its parameters and can identify overlapping communities.
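To make the per-edge cost concrete, the following is a minimal sequential sketch of the structural similarity commonly used in SCAN, sigma(u, v) = |N[u] ∩ N[v]| / sqrt(|N[u]| * |N[v]|), where N[u] is the closed neighborhood of u (u together with its neighbors). This definition is taken from the general SCAN literature and is assumed here; the abstract does not state the exact formulation the paper uses. The function names and the toy edge list are hypothetical, and the sketch is in Python rather than the OpenMP implementation the paper describes. Because each edge is scored independently, the loop over edges is the natural unit of work to distribute across threads.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """Structural similarity of edge (u, v), assuming closed neighborhoods."""
    nu = adj[u] | {u}  # closed neighborhood of u (assumption: u is included)
    nv = adj[v] | {v}  # closed neighborhood of v
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def edge_similarities(edges):
    """Build an undirected adjacency map and score every edge independently."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each edge is scored without reference to any other edge's result,
    # so this loop could in principle be split across parallel workers.
    return {(u, v): structural_similarity(adj, u, v) for u, v in edges}


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
sims = edge_similarities(edges)
```

In this toy graph the cost of scoring an edge depends on the degrees of its endpoints, so edges incident to high-degree nodes take longer. That imbalance is one plausible reason a parallel version would need load balancing, although the specific techniques used in the paper are not described in the abstract.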