Domain-Informed Negative Sampling Strategies for Dynamic Graph Embedding in Meme Stock-Related Social Networks

Yunming Hui, Inez Maria Zwetsloot, Simon Trimborn, Stevan Rudinac
2024-11-01
Abstract: Social network platforms like Reddit are increasingly impacting real-world economics. Meme stocks are a recent phenomenon in which price movements are driven by retail investors organising themselves via social networks. To study the impact of social networks on meme stocks, the first step is to analyse these networks. Going forward, predicting meme stocks' returns would first require predicting dynamic interactions. This differs from conventional link prediction, which is frequently applied in, e.g., recommendation systems. For this task, it is essential to predict more complex interaction dynamics, such as the exact timing of interactions and interaction types like loops. These are crucial for linking the network to meme stock price movements. Dynamic graph embedding (DGE) has recently emerged as a promising approach for modeling dynamic graph-structured data. However, current negative sampling strategies, an important component of DGE, are designed for conventional dynamic link prediction and do not capture the specific patterns present in meme stock-related social networks. This limits the training and evaluation of DGE models in analysing such social networks. To overcome this drawback, we propose novel negative sampling strategies based on the analysis of real meme stock-related social networks and financial knowledge. Our experiments show that the proposed negative sampling strategy can better evaluate and train DGE models targeted at meme stock-related social networks compared to existing baselines.
Social and Information Networks
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: in social networks related to meme stocks, the negative sampling strategies of existing Dynamic Graph Embedding (DGE) models are unable to effectively capture the patterns unique to these networks, thus limiting the performance of DGE models in analyzing and predicting such social networks. Specifically, the paper points out:

1. **Limitations of existing negative sampling strategies**:
   - Current negative sampling strategies are mainly designed for traditional dynamic link prediction tasks and do not fully consider the unique characteristics of meme stock-related social networks.
   - In these networks, the number of negative samples far exceeds that of positive samples, and many negative samples provide limited information because there may have been no interaction between users. This causes the model to focus too much on obvious non-connections during training and evaluation, while ignoring truly difficult-to-predict negative samples.
2. **Characteristics of meme stock-related social networks**:
   - Interactions in these networks are not randomly or evenly distributed, but have specific time and interaction-type patterns, such as repeated interactions and loops. These characteristics are crucial for understanding the price fluctuations of meme stocks.
   - For example, predicting whether two users who have already interacted will interact again can reveal the continuous interest in a certain stock, which in turn affects the stock price.
3. **Proposed new methods**:
   - The paper proposes several negative sampling strategies based on domain knowledge to better capture the dynamic characteristics of meme stock-related social networks.
   - These strategies include:
     - **Random sender and receiver**: By randomly replacing the sender and receiver nodes in positive samples, test the model's ability to predict whether an interaction will occur between any two nodes.
     - **Time sampling**: Generate negative samples at future time points to test the model's ability to predict whether node pairs that have already interacted will interact again.
     - **Negative self-loop**: Generate nodes that have not formed self-loops as negative samples to test the model's ability to predict the emergence of new self-loops.
4. **Comprehensive strategy**:
   - A joint negative sampling strategy (DINS) is proposed, which combines the above individual strategies and maintains the balance between positive and negative samples through positive enhancement to ensure that the model will not be biased towards predicting negative samples during the training process.

Through these improvements, the paper aims to improve the prediction performance of DGE models in meme stock-related social networks, especially the ability to capture and predict complex interaction dynamics.
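
To make the three individual strategies more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes interactions are stored as `(sender, receiver, timestamp)` tuples with integer timestamps. The function names, the `horizon` and `max_tries` parameters, and the choice to replace only one endpoint per negative sample are all illustrative assumptions. How DINS mixes the strategies and applies positive enhancement is not specified in this summary, so that part is left out.

```python
import random
from typing import List, Optional, Set, Tuple

# Hypothetical event layout: (sender, receiver, integer timestamp).
Event = Tuple[int, int, int]


def random_endpoint_negative(event: Event, nodes: List[int],
                             observed: Set[Event], rng: random.Random,
                             max_tries: int = 100) -> Optional[Event]:
    """Swap the sender or the receiver for a random node, keeping the time.

    Assumption: one endpoint is replaced per sample; the summary does not
    say whether one or both endpoints are replaced.
    """
    src, dst, t = event
    for _ in range(max_tries):
        if rng.random() < 0.5:
            candidate = (rng.choice(nodes), dst, t)
        else:
            candidate = (src, rng.choice(nodes), t)
        if candidate not in observed:  # reject interactions that really happened
            return candidate
    return None


def time_shift_negative(event: Event, observed: Set[Event],
                        rng: random.Random, horizon: int = 10,
                        max_tries: int = 100) -> Optional[Event]:
    """Keep the node pair but move it to a later time with no interaction."""
    src, dst, t = event
    for _ in range(max_tries):
        candidate = (src, dst, t + rng.randint(1, horizon))
        if candidate not in observed:
            return candidate
    return None


def self_loop_negative(event: Event, nodes: List[int],
                       observed: Set[Event],
                       rng: random.Random) -> Optional[Event]:
    """Pair a node that has no observed self-loop with itself."""
    _, _, t = event
    looped = {s for s, d, _ in observed if s == d}
    candidates = [n for n in nodes if n not in looped]
    if not candidates:
        return None
    node = rng.choice(candidates)
    return (node, node, t)


if __name__ == "__main__":
    events = [(0, 1, 1), (1, 2, 2), (2, 2, 3), (0, 1, 5)]
    observed, nodes, rng = set(events), [0, 1, 2, 3], random.Random(0)
    for e in events:
        print(e,
              random_endpoint_negative(e, nodes, observed, rng),
              time_shift_negative(e, observed, rng),
              self_loop_negative(e, nodes, observed, rng))
```

Each function returns `None` when it cannot find a valid negative, so a caller can fall back to another strategy. Rejecting candidates that appear in `observed` keeps true interactions from being mislabeled as negatives.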