Yuting Feng, Vincent Y. F. Tan, Bogdan Cautis
Abstract: We consider a ubiquitous scenario in the study of Influence Maximization (IM), in which there is limited knowledge about the topology of the diffusion network. We set the IM problem in a multi-round diffusion campaign, aiming to maximize the number of distinct users that are influenced. Leveraging the capability of bandit algorithms to effectively balance the objectives of exploration and exploitation, as well as the expressivity of neural networks, our study explores the application of neural bandit algorithms to the IM problem. We propose the framework IM-GNB (Influence Maximization with Graph Neural Bandits), where we provide an estimate of the users' probabilities of being influenced by influencers (also known as diffusion seeds). This initial estimate forms the basis for constructing both an exploitation graph and an exploration one. Subsequently, IM-GNB handles the exploration-exploitation tradeoff, by selecting seed nodes in real-time using Graph Convolutional Networks (GCN), in which the pre-estimated graphs are employed to refine the influencers' estimated rewards in each contextual setting. Through extensive experiments on two large real-world datasets, we demonstrate the effectiveness of IM-GNB compared with other baseline methods, significantly improving the spread outcome of such diffusion campaigns, when the underlying network is unknown.
Machine Learning, Artificial Intelligence, Information Retrieval, Social and Information Networks
What problem does this paper attempt to address?
This paper addresses how to maximize the spread of information when the topology of the diffusion network is unknown. Specifically, it studies how to select the most influential nodes (i.e., seed nodes or influencers) across a multi-round diffusion campaign so as to maximize the number of distinct users that are activated.
### Problem Background
1. **Information Diffusion Model**:
- In social networks, information diffusion is usually described by stochastic diffusion models, such as the Independent Cascade (IC) and Linear Threshold (LT) models.
- Selecting the seed nodes that maximize the expected spread is NP-hard under these common diffusion models.
2. **Challenges**:
- Meaningful influence probabilities are hard to obtain: learning them from past information cascades may require a large amount of data and is not always feasible.
- Without historical cascade data, traditional methods based on predefined diffusion models are difficult to apply.
3. **Limitations of Existing Methods**:
- Even the most effective IM algorithms rely on assumptions and parameters, which often fail to capture the complex reality of online information dissemination.
- Many methods rely on the known diffusion graph structure, but in actual scenarios, this structure is often unknown.
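To make the Independent Cascade model mentioned above concrete, here is a minimal sketch of one IC simulation and a Monte Carlo estimate of expected spread. It is an illustration only, not code from the paper: the function names, the adjacency-dictionary format, and the toy graph are assumptions. Note that it requires the edge probabilities to be known, which is exactly what is unavailable in the setting the paper targets.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade simulation and return the activated nodes.

    graph: dict mapping node -> list of (neighbor, activation_probability).
    Each newly activated node gets a single chance to activate each neighbor.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def expected_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs

# Hypothetical toy graph with deterministic edges, for illustration only.
toy_graph = {"a": [("b", 1.0), ("c", 0.0)], "b": [("d", 1.0)], "c": [], "d": []}
spread = expected_spread(toy_graph, ["a"], runs=10)
```

Finding the seed set that maximizes `expected_spread` is the NP-hard step, which is why even known-graph methods rely on approximation.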
### Solution
The paper proposes a new framework, IM-GNB (Influence Maximization with Graph Neural Bandits), which uses graph neural bandits to solve the above problems. The main contributions of IM-GNB include:
1. **Combining Contextual Multi-Armed Bandits (CMABs)**:
- Through the CMABs framework, the paper introduces contextual information (such as the characteristics of influencers and the information to be diffused) to better adapt to different diffusion situations.
2. **Balance between Exploration and Exploitation**:
- Utilize the capabilities of the bandit algorithm to achieve a balance between exploration (exploring unknown diffusion dynamics) and exploitation (exploiting known successful selections).
3. **Construction of User-User Correlation Graphs**:
- Construct user-user correlation graphs for exploration and exploitation purposes, capturing the complex interaction relationships between users and influencers.
- These graphs can be extended to various network settings and can work effectively even when the network topology is unknown.
4. **Real-Time Selection of Seed Nodes**:
- Develop a novel algorithm that combines contextual bandits with Graph Neural Networks (GNNs) to select seed nodes in real time and to refine the reward estimates in each context.
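The multi-round setting and the exploration-exploitation balance described above can be illustrated with a deliberately simplified loop. This sketch is not IM-GNB: it swaps in a plain, non-contextual epsilon-greedy rule in place of the graph neural bandit, and `run_campaign`, `diffuse`, and `audiences` are hypothetical names. What it does show is the objective, where the reward in each round counts only users who were not reached in earlier rounds.

```python
import random

def run_campaign(influencers, diffuse, rounds, epsilon=0.1, seed=0):
    """Pick one influencer per round; reward = newly reached distinct users.

    diffuse(influencer, rng) -> set of users activated in that round
    (a hypothetical black-box environment; the network itself is never observed).
    """
    rng = random.Random(seed)
    reached = set()
    totals = {k: 0.0 for k in influencers}  # cumulative new users per influencer
    pulls = {k: 0 for k in influencers}     # number of times each was selected
    for _ in range(rounds):
        untried = [k for k in influencers if pulls[k] == 0]
        if untried:
            choice = untried[0]                    # try every influencer once
        elif rng.random() < epsilon:
            choice = rng.choice(influencers)       # explore
        else:
            choice = max(influencers, key=lambda k: totals[k] / pulls[k])  # exploit
        new_users = diffuse(choice, rng) - reached
        reached |= new_users
        totals[choice] += len(new_users)
        pulls[choice] += 1
    return reached, pulls

# Hypothetical deterministic environment, for illustration only.
audiences = {"k1": {1, 2, 3}, "k2": {3, 4}}
reached, pulls = run_campaign(
    ["k1", "k2"], lambda k, rng: set(audiences[k]), rounds=5, epsilon=0.0
)
```

Because already-reached users contribute nothing, an influencer's observed reward shrinks as the campaign proceeds. This is one reason a fixed per-influencer average is a crude estimator, and it motivates conditioning the estimate on context, as the framework does.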
### Formula Summary
- **Exploitation Graph Weights (from Diffusion Probability Estimates)**:
\[
w^{(1)}_{i,t}(u, u')=\Phi^{(1)}\left(\mathbb{E}[p_{i,t,u} | k_i, C_t], \mathbb{E}[p_{i,t,u'} | k_i, C_t]\right)
\]
where \( p_{i,t,u}=h_u(k_i, C_t)\in[0, 1] \) is the expected probability that influencer \( k_i \) influences user \( u \) under context \( C_t \), and \( \Phi^{(1)} \) maps the two users' estimates to an edge weight.
- **Exploration Graph Weights**:
\[
w^{(2)}_{i,t}(u, u')=\Phi^{(2)}\left(h^{(2)}_u(\nabla h^{(1)}_u), h^{(2)}_{u'}(\nabla h^{(1)}_{u'})\right)
\]
- **Reward Estimation**:
\[
\hat{r}_{i,t}=f^{(1)}(k_i, C_t, G^{(1)}_{i,t})
\]
- **Potential Gain Estimation**:
\[
\hat{b}_{i,t}=f^{(2)}(k_i, C_t, G^{(2)}_{i,t})
\]
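The formulas above can be read as a three-step pipeline: turn per-user estimates into edge weights, aggregate over the resulting graph, and score each influencer. The sketch below illustrates that reading under several loudly stated assumptions that are not taken from the paper: \( \Phi \) is instantiated as a Gaussian kernel, the GCN is replaced by a single weight-free neighborhood average, and the selection rule simply adds \( \hat{r}_{i,t} \) and \( \hat{b}_{i,t} \). All function names are hypothetical.

```python
import math

def kernel_weight(p_u, p_v, bandwidth=0.1):
    """Assumed form of Phi: Gaussian kernel on the gap between two estimates."""
    return math.exp(-((p_u - p_v) ** 2) / (2 * bandwidth ** 2))

def build_graph(estimates, bandwidth=0.1):
    """estimates: dict user -> estimated probability. Returns (u, v) -> weight."""
    users = list(estimates)
    return {
        (u, v): kernel_weight(estimates[u], estimates[v], bandwidth)
        for u in users
        for v in users
    }

def smooth(estimates, graph):
    """One normalized neighborhood average: a stand-in for a GCN layer
    with no learned parameters (an assumption made for illustration)."""
    smoothed = {}
    for u in estimates:
        total = sum(graph[(u, v)] for v in estimates)
        smoothed[u] = sum(graph[(u, v)] * estimates[v] for v in estimates) / total
    return smoothed

def select_influencer(reward_hat, gain_hat):
    """Assumed rule: pick the influencer with the largest r_hat + b_hat."""
    return max(reward_hat, key=lambda k: reward_hat[k] + gain_hat[k])

# Hypothetical estimates for three users and two influencers.
estimates = {"u1": 0.80, "u2": 0.75, "u3": 0.10}
graph = build_graph(estimates)
refined = smooth(estimates, graph)
chosen = select_influencer({"k1": 0.4, "k2": 0.5}, {"k1": 0.3, "k2": 0.1})
```

In this toy example `k1` is chosen even though `k2` has the higher estimated reward, because its estimated potential gain is larger. That is the role the exploration term plays in the tradeoff.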
Through these formulas and methods, the IM-GNB framework can optimize the effectiveness of information diffusion campaigns in uncertain environments and significantly improve the diffusion results, especially in...