Information Cascade Prediction of complex networks based on Physics-informed Graph Convolutional Network

Dingguo Yu, Yijie Zhou, Suiyu Zhang, Wenbing Li, Michael Small, Ke-ke Shang
DOI: https://doi.org/10.1088/1367-2630/ad1b29
2024-01-06
New Journal of Physics
Abstract: Cascade prediction aims to estimate the popularity of information diffusing through complex networks. It benefits many applications, from identifying viral marketing and fake news propagation in social media to estimating the scientific impact (citations) of a new publication. Predicting cascade growth size effectively has therefore become a significant problem. Most previous deep learning methods achieve remarkable results but concentrate on mining structural and temporal features from diffusion networks and propagation paths; neglecting the dynamics of the spread limits further gains in prediction performance. In this paper, we propose a novel framework called Physics-informed Graph Convolutional Network (PiGCN) for cascade prediction, which combines explicit features (structural and temporal) with the dynamic status of propagation to learn the diffusion ability of cascades. Specifically, PiGCN is an end-to-end predictor: it first splits a given cascade into a sequence of sub-cascade graphs and learns the local structure of each sub-cascade via a graph convolutional network (GCN), then uses a multi-layer perceptron (MLP) to predict the cascade growth size. Moreover, a dynamic neural network that combines PDE-like equations with a deep learning method is designed to extract the latent dynamics of cascade diffusion, capturing the rate of evolution in both structure and time. To evaluate PiGCN, we conduct extensive experiments on two well-known large-scale datasets, from Sina Weibo and the arXiv subject listing HEP-PH. Our model outperforms mainstream models, and the results show that dynamic features are highly significant for cascade size prediction.
physics, multidisciplinary
What problem does this paper attempt to address?
The paper addresses information cascade prediction: estimating how popular a piece of information becomes as it diffuses through a complex network. The goal is to predict the scale of information dissemination in settings such as social media (e.g., Weibo, Twitter) and academic networks (e.g., arXiv). Applications include identifying viral marketing, detecting fake news dissemination, and evaluating the scientific impact of new publications.

The paper proposes a new framework, the Physics-informed Graph Convolutional Network (PiGCN), for information cascade prediction. The framework combines explicit features (structural and temporal) with the dynamic state of propagation to learn the diffusion capability of a cascade. PiGCN first splits a given cascade into a sequence of sub-cascade graphs and learns the local structure of each sub-cascade through a Graph Convolutional Network (GCN); a Multi-Layer Perceptron (MLP) then predicts the cascade growth scale. In addition, a dynamic neural network that integrates PDE-like (partial differential equation) equations with deep learning extracts the latent dynamic features of the diffusion process.

In summary, the main contributions of the paper are:
1. Simplified feature processing: Introduces a new method that uses explicit features such as structural features, temporal features, and user influence features, avoiding the challenges of processing text and image features.
2. Method for extracting dynamic features: Treats information diffusion as a dynamic system and uses a time- and space-dependent PDE-like network to compute the temporal and spatial derivatives of the propagation process, thereby extracting implicit information about how it changes.
3. Introduction of a physics-informed framework: Embeds available but incomplete physical knowledge (scientific principles) into the popularity prediction network by adding physical constraints to the loss function, so that the network captures the dynamics of information diffusion.
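
The two ideas above that are easiest to misread, splitting a cascade into a sequence of sub-cascades and adding a physics term to the loss, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation. The event format `(parent, child, time)`, the equal-width time windows, the forward finite difference, the squared-error form of both loss terms, and all function names are hypothetical choices made for the example. The `modelled_rates` argument stands in for the output of the PDE-like network, which the summary does not specify.

```python
from typing import List, Set, Tuple

# Hypothetical event format: (parent_node, child_node, adoption_time).
Event = Tuple[int, int, float]


def split_into_subcascades(events: List[Event], observation_time: float,
                           num_snapshots: int) -> List[List[Event]]:
    """Return cumulative snapshots; snapshot k keeps events up to time k * step."""
    step = observation_time / num_snapshots
    return [[e for e in events if e[2] <= k * step]
            for k in range(1, num_snapshots + 1)]


def snapshot_sizes(snapshots: List[List[Event]], root: int) -> List[int]:
    """Count the distinct nodes (including the root) reached in each snapshot."""
    sizes = []
    for snapshot in snapshots:
        nodes: Set[int] = {root}
        for parent, child, _ in snapshot:
            nodes.add(parent)
            nodes.add(child)
        sizes.append(len(nodes))
    return sizes


def finite_difference(values: List[float], dt: float) -> List[float]:
    """Forward difference: a crude estimate of the temporal derivative."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def physics_informed_loss(predicted: float, target: float,
                          observed_rates: List[float],
                          modelled_rates: List[float],
                          weight: float) -> float:
    """Data term plus a weighted residual between observed and modelled rates.

    Assumption: both terms are squared errors; the paper's exact form may differ.
    """
    data_term = (predicted - target) ** 2
    residuals = [(o - m) ** 2 for o, m in zip(observed_rates, modelled_rates)]
    physics_term = sum(residuals) / len(residuals)
    return data_term + weight * physics_term


# Toy cascade rooted at node 0, observed for 6 time units in 3 windows.
events = [(0, 1, 1.0), (0, 2, 2.0), (1, 3, 3.5), (2, 4, 5.0)]
snapshots = split_into_subcascades(events, observation_time=6.0, num_snapshots=3)
sizes = snapshot_sizes(snapshots, root=0)        # [3, 4, 5]
rates = finite_difference(sizes, dt=2.0)         # [0.5, 0.5]
# [0.5, 0.0] is a made-up stand-in for the dynamics network's output.
loss = physics_informed_loss(1.0, 1.5, rates, [0.5, 0.0], weight=0.1)
```

In a full model, each snapshot would be encoded by a GCN instead of being reduced to a node count, and the derivatives would come from the learned PDE-like network. The structure of the loss, a prediction error plus a weighted penalty for violating the assumed dynamics, is the part this sketch aims to show.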