A comparative study of similarity-based and GNN-based link prediction approaches

Md Kamrul Islam, Sabeur Aridhi, Malika Smail-Tabbone
DOI: https://doi.org/10.48550/arXiv.2008.08879
2020-08-20
Abstract: The task of inferring the missing links in a graph based on its current structure is referred to as link prediction. Link prediction methods based on pairwise node similarity are well-established approaches in the literature. They show good prediction performance on many real-world graphs, though they are heuristics and lack universal applicability. On the other hand, the success of neural networks for classification tasks in various domains has led researchers to study them on graphs. A neural network that can operate directly on a graph is termed a graph neural network (GNN). A GNN is able to learn hidden features from graphs, which can be used for the link prediction task. Link prediction approaches based on GNNs have gained much attention from researchers due to their convincingly high performance on many real-world graphs. This appraisal paper studies some similarity-based and GNN-based link prediction approaches in the domain of homogeneous graphs, which consist of a single type of (attributed) nodes and a single type of pairwise links. We evaluate the studied approaches against several benchmark graphs with different properties from various domains.
Social and Information Networks, Machine Learning
What problem does this paper attempt to address?
This paper addresses link prediction: inferring the missing links in a graph from its current structure. Specifically, the authors compare the performance of node-similarity-based methods and graph neural network (GNN)-based methods on this task. Similarity-based methods predict links by computing a similarity score between pairs of nodes, while GNN-based methods use neural networks that operate directly on the graph and learn its hidden features for link prediction. The authors aim to evaluate both families on different types of graphs and to analyze their advantages and limitations. The paper notes that although similarity-based methods show good prediction performance on many real-world graphs, they are heuristics and lack universal applicability. GNN-based methods, on the other hand, have attracted extensive attention in recent years, motivated by the success of neural networks in classification tasks across multiple fields and by their convincingly high performance on many real-world graphs. The authors select several similarity-based and GNN-based link prediction methods and evaluate them on different benchmark graphs, focusing mainly on two aspects: prediction accuracy and computation time. Through this study, the authors hope to provide guidance for choosing an appropriate link prediction method when dealing with homogeneous graphs (i.e., graphs that contain only one type of node and one type of link).
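To make the similarity-based idea concrete, the following is a minimal sketch in plain Python that scores every unconnected node pair and ranks the pairs as candidate links. The three measures shown (common neighbors, Jaccard, Adamic-Adar) are widely used heuristics chosen here for illustration; the summary above does not state which measures the paper evaluates, and the helper names and the toy edge list are assumptions, not taken from the paper.

```python
from itertools import combinations
from math import log


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1/log(degree) over shared neighbors; low-degree neighbors count more.

    A shared neighbor is adjacent to both u and v, so its degree is at least 2
    and the logarithm is strictly positive.
    """
    return sum(1.0 / log(len(adj[w])) for w in adj[u] & adj[v])


def rank_candidates(adj, score):
    """Rank all currently unconnected node pairs by a similarity score, best first."""
    nodes = sorted(adj)
    candidates = [(u, v) for u, v in combinations(nodes, 2) if v not in adj[u]]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Hypothetical toy graph, used only to exercise the functions.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
ranking = rank_candidates(adj, adamic_adar)
print(ranking[0])  # the highest-scoring candidate link
```

In this toy graph the pair `("a", "d")` shares two neighbors and is ranked first by all three measures. A GNN-based approach would instead learn node or subgraph representations from data and train a classifier on them, which generally requires a deep learning framework and is not sketched here.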