Unbiased Sampling of Social Media Networks for Well-connected Subgraphs.

Dong Wang, Zhenyu Li, Gareth Tyson, Zhenhua Li, Gaogang Xie
DOI: https://doi.org/10.1145/3110025.3110141
2017-01-01
Abstract: Sampling social graphs is critical for studying phenomena such as information diffusion. However, obtaining datasets that are both unbiased and well-connected is often laborious: existing survey algorithms are unable to generate well-connected samples, and current random-walk-based unbiased sampling algorithms rely on rejection sampling, which heavily undermines performance. This paper proposes a novel random-walk-based algorithm, Unbiased Sampling using Dummy Edges (USDE). It injects dummy edges between nodes at which walkers would otherwise experience excessive rejections before moving on. We propose a rejection probability estimation algorithm to facilitate the construction of dummy edges and the computation of moving probabilities. Finally, we apply USDE to two real-life social media platforms: Twitter and Sina Weibo. The results demonstrate that USDE generates well-connected samples and outperforms existing approaches in terms of sampling efficiency and sample quality.
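To make the rejection problem concrete, the following minimal sketch shows a standard Metropolis-Hastings random walk, a common rejection-based unbiased sampler of the kind the abstract refers to. It is not USDE and is not taken from the paper; the function name `mhrw`, the toy star graph, and all parameters are assumptions chosen for illustration.

```python
import random

def mhrw(adj, start, steps, seed=0):
    """Metropolis-Hastings random walk targeting a uniform node distribution.

    Illustrative baseline only (not the paper's USDE algorithm).
    adj maps each node to a list of its neighbours.
    """
    rng = random.Random(seed)
    current = start
    samples = []
    rejections = 0
    for _ in range(steps):
        candidate = rng.choice(adj[current])
        # Accept the move with probability min(1, deg(current) / deg(candidate)).
        accept_prob = min(1.0, len(adj[current]) / len(adj[candidate]))
        if rng.random() < accept_prob:
            current = candidate
        else:
            rejections += 1  # the walker stays put and the step is wasted
        samples.append(current)
    return samples, rejections

# Hypothetical toy graph: one hub connected to ten leaves.
hub = 0
leaves = list(range(1, 11))
adj = {hub: leaves, **{leaf: [hub] for leaf in leaves}}

samples, rejections = mhrw(adj, start=hub, steps=10000)
print(f"rejected {rejections} of {len(samples)} steps")
```

On this toy graph a walker sitting on a leaf accepts a move to the hub with probability 1/10, so most steps are rejected. This is the kind of stall at low-degree nodes adjacent to high-degree nodes that, according to the abstract, the dummy edges are meant to relieve.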