A Dynamic Programming Framework for Large-Scale Online Clustering on Graphs

Li Yantao, Zhao Xiang, Qu Zehui
DOI: https://doi.org/10.1007/s11063-020-10329-1
IF: 2.565
2020-01-01
Neural Processing Letters
Abstract: As a fundamental technique for data analysis, graph clustering, which groups graph data into clusters, has attracted great attention in recent years. In this paper, we present DPOCG, a dynamic programming framework for large-scale online clustering on graphs, which improves the scalability of a wide range of graph clustering algorithms. Specifically, DPOCG first identifies the nodes on a large-scale graph whose states are unchanged from the previous time step, then constructs supernodes from these unchanged nodes, which greatly reduces the size of the graph at the current time, and collapses nodes whose degrees are less than a predefined threshold. Based on our density-based graph clustering algorithm (DGCM), DPOCG partitions the reduced graph into clusters. In addition, we theoretically analyze DPOCG in terms of supernode generation, clustering on the reduced graph, and computational complexity. We evaluate DPOCG on a synthetic dataset and seven real-world datasets, and the experimental results show that DPOCG consumes less running time and improves the efficiency of clustering.
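The supernode step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not confirm: a node's "state" is taken to be its neighbor set, unchanged nodes that shared a cluster at the previous time step are merged into one supernode, and the graph is an undirected adjacency dictionary in which every node appears as a key. The function name `reduce_graph` and its parameters are hypothetical. The low-degree collapse and the DGCM clustering step are omitted because the abstract gives no detail on them.

```python
def reduce_graph(prev_adj, curr_adj, prev_labels):
    """Merge unchanged nodes into supernodes (illustrative sketch only).

    prev_adj, curr_adj: dict mapping node -> set of neighbors (undirected).
    prev_labels: dict mapping node -> cluster label at the previous time step.
    Returns (rep, reduced): the node -> representative map and the reduced graph.
    """
    rep = {}
    for node, nbrs in curr_adj.items():
        # Assumption: a node is "unchanged" if its neighbor set is identical
        # in both snapshots and it already has a previous cluster label.
        unchanged = prev_adj.get(node) == nbrs and node in prev_labels
        # Unchanged nodes from the same previous cluster share one supernode id.
        rep[node] = ("super", prev_labels[node]) if unchanged else node

    reduced = {}
    for node, nbrs in curr_adj.items():
        u = rep[node]
        reduced.setdefault(u, set())  # keep isolated representatives
        for nbr in nbrs:
            v = rep[nbr]
            if u != v:  # drop edges that fall inside a supernode
                reduced[u].add(v)
                reduced.setdefault(v, set()).add(u)
    return rep, reduced


# Hypothetical example: a new edge 3-4 appears between two earlier clusters.
prev_adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: {5}, 5: {4}}
curr_adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
prev_labels = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b"}

rep, reduced = reduce_graph(prev_adj, curr_adj, prev_labels)
print(len(curr_adj), "nodes reduced to", len(reduced))
```

In this toy example only nodes 3 and 4 changed, so nodes 1 and 2 fold into one supernode and node 5 becomes another. Any clustering algorithm would then run on the smaller graph `reduced` instead of on `curr_adj`.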