Bring Order into the Samples: A Novel Scalable Method for Influence Maximization

Xiaoyang Wang, Ying Zhang, Wenjie Zhang, Xuemin Lin, Chen
DOI: https://doi.org/10.1109/tkde.2016.2624734
IF: 9.235
2017-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: As a key problem in viral marketing, influence maximization has been extensively studied in the literature. Given a positive integer $k$, a social network $\mathcal{G}$, and a certain propagation model, it aims to find a set of $k$ nodes that have the largest influence spread. The state-of-the-art method IMM is based on the reverse influence sampling (RIS) framework. By using the martingale technique, it greatly outperforms the previous methods in efficiency. However, IMM still has limitations in scalability due to the high overhead of deciding a tight sample size. In this paper, instead of spending the effort on deciding a tight sample size, we present a novel bottom-k sketch-based RIS framework, namely BKRIS, which brings the order of samples into the RIS framework. By applying the sketch technique, we can derive early termination conditions to significantly accelerate the seed set selection procedure. Moreover, we provide a cost-effective method to find a proper sample size to bound the quality of the returned result. In addition, we provide several optimization techniques to reduce the cost of generating the samples' order and to deal efficiently with the worst-case scenario. We demonstrate the efficiency and effectiveness of the proposed method on 10 real-world datasets. Compared with the IMM approach, BKRIS can achieve up to two orders of magnitude speedup with almost the same influence spread. On the largest dataset, with 1.8 billion edges, BKRIS can return 50 seeds in 1.3 seconds and 5,000 seeds in 36.6 seconds, whereas IMM takes 55.32 seconds and 3,664.97 seconds, respectively.
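For readers unfamiliar with the RIS framework that both IMM and BKRIS build on, the following is a minimal sketch of plain RIS, not of BKRIS itself: it omits the bottom-k sketch, the sample ordering, and the early termination conditions. It assumes the independent cascade propagation model and a fixed, user-chosen sample size. The function names (`sample_rr_set`, `greedy_seeds`, `ris`) and the `in_neighbors` adjacency format are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def sample_rr_set(in_neighbors, nodes, rng):
    """Draw one reverse-reachable (RR) set under independent cascade.

    in_neighbors maps a node v to a list of (u, p) pairs, meaning the
    edge u -> v is live with probability p (illustrative format).
    """
    root = rng.choice(nodes)
    rr, stack = {root}, [root]
    while stack:
        v = stack.pop()
        for u, p in in_neighbors.get(v, ()):
            # Walk the edge backwards if it is live in this sample.
            if u not in rr and rng.random() < p:
                rr.add(u)
                stack.append(u)
    return rr


def greedy_seeds(rr_sets, k):
    """Greedy max coverage: pick up to k nodes covering the most RR sets."""
    covers = defaultdict(set)  # node -> indices of RR sets containing it
    for i, rr in enumerate(rr_sets):
        for u in rr:
            covers[u].add(i)
    seeds, covered = [], set()
    for _ in range(k):
        best = max(covers, key=lambda u: len(covers[u] - covered), default=None)
        if best is None or not (covers[best] - covered):
            break  # no node adds new coverage
        seeds.append(best)
        covered |= covers[best]
    return seeds, len(covered)


def ris(in_neighbors, nodes, k, num_samples, seed=0):
    """Return (seeds, estimated spread) from a fixed number of RR sets."""
    rng = random.Random(seed)
    rr_sets = [sample_rr_set(in_neighbors, nodes, rng) for _ in range(num_samples)]
    seeds, covered = greedy_seeds(rr_sets, k)
    # The covered fraction of RR sets, scaled by n, estimates the spread.
    return seeds, len(nodes) * covered / num_samples
```

In this sketch `num_samples` is simply supplied by the caller. Choosing it tightly is the overhead the abstract attributes to IMM, and BKRIS is described as avoiding that effort by exploiting the order of the samples.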