Link prediction for ex ante influence maximization on temporal networks

Eric Yanchenko, Tsuyoshi Murata, Petter Holme
2023-09-12
Abstract: Influence maximization (IM) is the task of finding the most important nodes in order to maximize the spread of influence or information on a network. This task is typically studied on static or temporal networks where the complete topology of the graph is known. In practice, however, the seed nodes must be selected before observing the future evolution of the network. In this work, we consider this realistic ex ante setting where $p$ time steps of the network have been observed before selecting the seed nodes. Then the influence is calculated after the network continues to evolve for a total of $T>p$ time steps. We address this problem by using statistical, non-negative matrix factorization, and graph neural network link prediction algorithms to predict the future evolution of the network and then applying existing influence maximization algorithms to the predicted networks. Additionally, the output of the link prediction methods can be used to construct novel IM algorithms. We apply the proposed methods to eight real-world and synthetic networks to compare their performance using the Susceptible-Infected (SI) diffusion model. We demonstrate that it is possible to construct quality seed sets in the ex ante setting, as we achieve influence spread within 87\% of the optimal spread on seven of eight networks. In many settings, choosing seed nodes based only on historical edges provides results comparable to those obtained by treating the future graph snapshots as known. The proposed heuristics based on the link prediction model are also some of the best-performing methods. These findings indicate that, for these eight networks under the SI model, the latent process which determines the most influential nodes may not have large temporal variation. Thus, knowing the future status of the network is not necessary to obtain good results for ex ante IM.
Social and Information Networks
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper aims to address the problem of **ex ante influence maximization in temporal networks**. Specifically:

1. **Problem Background**:
   - Traditional influence maximization (IM) problems usually assume that the complete network topology is known, i.e., an ex post assumption.
   - In practice, seed nodes must be selected before the future evolution of the network is observed, which is a more realistic but also more challenging scenario.
2. **Specific Problem**:
   - Given a temporal network $G_1, \dots, G_p$ that has been observed for only the first $p$ time steps, how should the seed nodes $S$ be selected at time $t=p$ so that information spread is maximized after the network continues to evolve for a further $T-p$ steps (i.e., at time $t=T+1$)?
   - In this scenario, the future network topology is unknown, so the future evolution of the network must be predicted to help select the seed nodes.
3. **Solution**:
   - Use link prediction algorithms, namely statistical methods, non-negative matrix factorization (NMF), and graph neural networks (GNN), to predict the future network topology.
   - Apply existing influence maximization algorithms to the predicted network to select seed nodes.
   - Propose novel IM heuristics that identify important nodes directly from the output of the link prediction model.

Through this approach, the paper demonstrates that high-quality seed sets can be constructed in the ex ante setting: across eight real-world and synthetic networks, the influence spread is within 87% of the optimal spread on seven of them. These findings suggest that, for these networks under the SI model, the most influential nodes may not exhibit large temporal variation, so good results can be obtained without knowing the future state of the network.
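
The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a simple stand-in for the statistical link predictor (each edge is scored by the fraction of observed snapshots in which it appears), ranks nodes by their predicted expected degree as an illustrative heuristic, and evaluates the chosen seeds with a basic SI simulation in which infections made within a snapshot spread onward only from the next snapshot. The function names (`predict_edge_scores`, `select_seeds`, `si_spread`), the infection probability `beta`, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def predict_edge_scores(history):
    """Assumed baseline predictor: score each undirected edge by the
    fraction of the observed snapshots in which it appears."""
    counts = defaultdict(int)
    for edges in history:
        # Normalise edge orientation and count each edge once per snapshot.
        unique_edges = {(min(a, b), max(a, b)) for a, b in edges}
        for edge in unique_edges:
            counts[edge] += 1
    num_snapshots = len(history)
    return {edge: c / num_snapshots for edge, c in counts.items()}


def select_seeds(edge_scores, k):
    """Illustrative heuristic: pick the k nodes with the largest predicted
    expected degree (sum of scores of incident edges)."""
    strength = defaultdict(float)
    for (u, v), score in edge_scores.items():
        strength[u] += score
        strength[v] += score
    # Ties are broken by node label so the result is deterministic.
    return sorted(strength, key=lambda n: (-strength[n], n))[:k]


def si_spread(snapshots, seeds, beta, rng):
    """One SI run over a list of snapshots (each a list of undirected edges).
    Returns the number of infected nodes after the last snapshot."""
    infected = set(seeds)
    for edges in snapshots:
        newly_infected = set()
        for u, v in edges:
            if u in infected and v not in infected and rng.random() < beta:
                newly_infected.add(v)
            elif v in infected and u not in infected and rng.random() < beta:
                newly_infected.add(u)
        # Nodes infected in this snapshot can transmit from the next one on.
        infected |= newly_infected
    return len(infected)


# Toy data: p = 3 observed snapshots and 2 future snapshots (hypothetical).
history = [
    [(0, 1), (0, 2), (0, 3)],
    [(0, 1), (0, 2), (3, 4)],
    [(0, 1), (0, 4), (2, 3)],
]
future = [
    [(0, 1), (0, 2)],
    [(0, 3), (3, 4)],
]

seeds = select_seeds(predict_edge_scores(history), k=1)
spread = si_spread(future, seeds, beta=1.0, rng=random.Random(0))
print(seeds, spread)
```

Here the seeds are chosen using only `history`, while `future` is used solely to evaluate the resulting spread, which mirrors the ex ante setting. With `beta` below 1 the simulation is stochastic, so in practice the spread would be averaged over many runs.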