GNN Applied to Ego-nets for Friend Suggestions

Evgeny Zamyatin
2024-12-16
Abstract: A major problem of making friend suggestions in social networks is the large size of social graphs, which can have hundreds of millions of people and tens of billions of connections. Classic methods based on heuristics or factorizations are often used to address the difficulties of scaling more complex models. However, the unsupervised nature of these methods can lead to suboptimal results. In this work, we introduce the Generalized Ego-network Friendship Score framework, which makes it possible to use complex supervised models without sacrificing scalability. The main principle of the framework is to reduce the problem of link prediction on a full graph to a series of low-scale tasks on ego-nets with subsequent aggregation of their results. Here, the underlying model takes an ego-net as input and produces a pairwise relevance matrix for its nodes. In addition, we develop the WalkGNN model which is capable of working effectively in the social network domain, where these graph-level link prediction tasks are heterogeneous, dynamic and featureless. To measure the accuracy of this model, we introduce the Ego-VK dataset that serves as an exact representation of the real-world problem that we are addressing. Offline experiments on the dataset show that our model outperforms all baseline methods, and a live A/B test demonstrates the growth of business metrics as a result of utilizing our approach.
Social and Information Networks, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses friend recommendation in social networks, framed as link prediction on very large social graphs. Such graphs can contain hundreds of millions of users and tens of billions of connections, which makes complex models difficult to scale. Classic heuristic-based or factorization-based methods do scale to this size, but their unsupervised nature can lead to suboptimal results.
To address this, the author proposes the Generalized Ego-network Friendship Score framework. Its core idea is to reduce link prediction on the full graph to a series of small-scale tasks on ego-nets (ego-centric networks) and then aggregate the results of these tasks. An ego-net is a subgraph centered on a given node that includes all of its neighbors and the connections between them. The underlying model takes an ego-net as input and produces a pairwise relevance matrix for its nodes, so complex supervised models can be used without sacrificing scalability.
The author also develops WalkGNN, a second-order graph neural network (GNN) designed for the social network domain, where these graph-level link prediction tasks are heterogeneous, dynamic, and lack node features. The key component of WalkGNN is the WalkConv layer, which treats each edge as an information filter and passes the states of node pairs through it. To evaluate the model, the author introduces Ego-VK, a real-world dataset extracted from the VK social network that mirrors the production problem. In offline experiments on this dataset WalkGNN outperforms all baseline methods, and a live A/B test shows growth in business metrics.
In summary, the paper proposes a framework and a model that improve the accuracy of link prediction for friend recommendation while maintaining scalability.
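The ego-net decomposition described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, the per-ego-net scorer is a simple shared-neighbor count standing in for the learned model (WalkGNN in the paper), and summing scores across ego-nets is an assumed aggregation rule, since the summary does not specify one. The ego node itself is left out of its ego-net here for simplicity.

```python
from collections import defaultdict
from itertools import combinations


def ego_net(graph, ego):
    """Return the ego-net of `ego` as adjacency sets over its neighbors.

    Only edges between neighbors are kept; the ego itself is omitted
    (a simplification made for this sketch).
    """
    nodes = set(graph[ego])
    return {u: graph[u] & nodes for u in nodes}


def placeholder_relevance(sub):
    """Hypothetical stand-in for the learned pairwise relevance model.

    Scores every unlinked pair in the ego-net as 1 (for the shared ego)
    plus the number of neighbors the pair shares inside the ego-net.
    """
    scores = {}
    for u, v in combinations(sorted(sub), 2):
        if v not in sub[u]:
            scores[(u, v)] = 1.0 + len(sub[u] & sub[v])
    return scores


def aggregate_scores(graph, relevance=placeholder_relevance):
    """Score each ego-net independently, then sum the scores per pair.

    Summation is an assumption; any other aggregation could be swapped in.
    """
    total = defaultdict(float)
    for ego in graph:
        for pair, score in relevance(ego_net(graph, ego)).items():
            total[pair] += score
    return dict(total)


# Toy undirected graph stored as adjacency sets.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
    "e": set(),
}
toy_scores = aggregate_scores(toy_graph)
```

On this toy graph the only unlinked pair that appears together in any ego-net is (b, d). It is scored in the ego-nets of both a and c, so its aggregated score combines two independent small-scale predictions. Because each ego-net is processed on its own, the per-ego step can be distributed across workers, which is the property that lets the framework scale.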