Clustering-Structure Representative Sampling from Graph Streams.

Jianpeng Zhang, Kaijie Zhu, Yulong Pei, George H. L. Fletcher, Mykola Pechenizkiy
DOI: https://doi.org/10.1007/978-3-319-72150-7_22
2017-01-01
Abstract: Most existing sampling algorithms on graphs (i.e., network-structured data) focus on sampling from memory-resident static graphs and assume that the entire graph is always available. However, the graphs encountered in modern applications are often too large and/or too dynamic to be processed with limited memory. Furthermore, existing sampling techniques are inadequate for preserving the inherent clustering structure, which is an essential property of complex networks. To tackle these problems, we propose a new sampling algorithm that dynamically maintains a representative sample and is capable of retaining the clustering structure of a graph stream at any time. The performance of the proposed algorithm is evaluated through empirical experiments on real-world networks. The experimental results show that the proposed CPIES algorithm produces clustering-structure representative samples and outperforms current online sampling algorithms.