Clustering Affiliation Inference from Graph Samples

Jianpeng Zhang, Kaijie Zhu, Yulong Pei, George Fletcher, Mykola Pechenizkiy
2018-01-01
Abstract: Graph sampling is a widely used approach to the scalability problem in analyzing large-scale graphs, and several promising cluster-preserving sampling algorithms have been proposed. However, once the clustering structure of a sampled graph has been obtained, a method is still needed to infer the clustering affiliations of all remaining nodes in the original graph from the clustered nodes in the sampled subgraph. In this paper, we present a new two-stage clustering inference (TCI) method that infers the clustering affiliations of all nodes in the original graph. TCI consists of two stages: 1) initialization of the clustering affiliations of unsampled nodes, based on the computed affiliation information of their neighborhoods; 2) label propagation over the whole graph. Our experimental results demonstrate that TCI, combined with any of the cluster-preserving sampling strategies considered, can infer the clustering affiliations of the full node population well, and that it outperforms the competing methods.
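The two stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' TCI implementation: the abstract does not specify how neighborhood affiliation is computed or how propagation is scheduled, so the sketch assumes a plain majority vote among labelled neighbors for initialization and a simple asynchronous label propagation afterwards. The function name `infer_affiliations`, its parameters, and the tie-breaking rule are all hypothetical.

```python
from collections import Counter


def infer_affiliations(adj, sampled_labels, max_rounds=20):
    """Illustrative two-stage inference (assumed details, not the paper's TCI).

    adj: dict mapping each node to a list of its neighbours (undirected graph).
    sampled_labels: dict mapping each sampled node to its cluster label.
    Nodes and labels must be sortable so that ties resolve deterministically.
    """
    labels = dict(sampled_labels)

    # Stage 1 (assumption: majority vote): give each unsampled node the most
    # common label among its already-labelled neighbours, sweeping outward
    # from the sample until no further node can be labelled.
    pending = [v for v in adj if v not in labels]
    progressed = True
    while pending and progressed:
        progressed = False
        still_pending, new_labels = [], {}
        for v in pending:
            votes = Counter(labels[u] for u in adj[v] if u in labels)
            if votes:
                best = max(votes.values())
                new_labels[v] = min(c for c, n in votes.items() if n == best)
                progressed = True
            else:
                still_pending.append(v)
        labels.update(new_labels)
        pending = still_pending

    # Stage 2 (assumption: asynchronous updates, sampled nodes may change too):
    # label propagation over the whole graph until labels stop changing.
    for _ in range(max_rounds):
        changed = False
        for v in sorted(adj):
            votes = Counter(labels[u] for u in adj[v] if u in labels)
            if not votes:
                continue  # no labelled neighbour, nothing to propagate
            best = max(votes.values())
            if v in labels and votes[labels[v]] == best:
                continue  # current label is already among the most frequent
            labels[v] = min(c for c, n in votes.items() if n == best)
            changed = True
        if not changed:
            break
    return labels
```

In this sketch, nodes that cannot be reached from any sampled node stay unlabelled and are simply absent from the returned dictionary. Whether the labels of sampled nodes should be held fixed during propagation is a design choice the abstract leaves open.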