Overlapping Communities in Social Networks
Jan Dreier, Philipp Kuinke, Rafael Przybylski, Felix Reidl, Peter Rossmanith, Somnath Sikdar
DOI: https://doi.org/10.48550/arXiv.1412.4973
2014-12-18
Abstract: Complex networks can typically be broken down into groups or modules. Discovering this "community structure" is an important step in studying the large-scale structure of networks. Many algorithms have been proposed for community detection, and benchmarks have been created to evaluate their performance. Typically, algorithms for community detection either partition the graph (non-overlapping communities) or find node covers (overlapping communities).
In this paper, we propose a particularly simple semi-supervised learning algorithm for finding communities. In essence, given the community information of a small number of "seed nodes", the method uses random walks from the seed nodes to uncover the community information of the whole network. The algorithm runs in time $O(k \cdot m \cdot \log n)$, where $m$ is the number of edges, $n$ the number of nodes, and $k$ the number of communities in the network. In sparse networks with $m = O(n)$ and a constant number of communities, this running time is almost linear in the size of the network. Another important feature of our algorithm is that it can be used for either non-overlapping or overlapping communities.
We test our algorithm using the LFR benchmark, created by Lancichinetti, Fortunato, and Radicchi specifically for the purpose of evaluating such algorithms. Our algorithm can compete with the best algorithms for both non-overlapping and overlapping communities, as identified in the comprehensive study of Lancichinetti and Fortunato.
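The seed-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a plain random walk with restart (computed by power iteration) from each community's seed nodes, then labels every node by comparing the per-community scores. The function names, the `steps` and `restart` parameters, the `overlap_ratio` threshold, and the toy graph are all assumptions made for illustration. The sketch also assumes an undirected graph given as an adjacency dict with no isolated nodes.

```python
from collections import defaultdict


def walk_scores(adj, seeds, steps=20, restart=0.15):
    """Random walk with restart from `seeds` (illustrative, not the paper's method).

    Returns a dict mapping node -> probability mass after `steps` iterations.
    Assumes every node in `adj` has at least one neighbour.
    """
    start = {v: 1.0 / len(seeds) for v in seeds}
    p = dict(start)
    for _ in range(steps):
        nxt = defaultdict(float)
        # Spread the non-restarting share of each node's mass to its neighbours.
        for v, mass in p.items():
            share = (1.0 - restart) * mass / len(adj[v])
            for u in adj[v]:
                nxt[u] += share
        # Return the restarting share to the seed nodes.
        for v, mass in start.items():
            nxt[v] += restart * mass
        p = nxt
    return p


def assign_communities(adj, seed_sets, overlap_ratio=None):
    """Label each node with one community (default) or several (overlapping).

    `overlap_ratio` is a hypothetical knob: a node joins every community whose
    score is at least `overlap_ratio` times the node's best score.
    """
    scores = {c: walk_scores(adj, s) for c, s in seed_sets.items()}
    result = {}
    for v in adj:
        per_c = {c: scores[c].get(v, 0.0) for c in seed_sets}
        best = max(per_c.values())
        if overlap_ratio is None:
            result[v] = {max(per_c, key=per_c.get)}
        else:
            result[v] = {c for c, s in per_c.items()
                         if s > 0 and s >= overlap_ratio * best}
    return result


# Toy graph (assumed for illustration): two triangles joined by the edge 2-3.
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
seed_sets = {"A": {0}, "B": {5}}

hard = assign_communities(adj, seed_sets)
soft = assign_communities(adj, seed_sets, overlap_ratio=0.3)
print(hard)
print(soft)
```

On this toy graph the non-overlapping call assigns each triangle to the community of its seed. Each iteration touches every edge at most twice, so running `steps` iterations for each of the $k$ communities costs $O(k \cdot m \cdot \text{steps})$; choosing `steps` on the order of $\log n$ gives a cost of the same shape as the bound quoted in the abstract, although the abstract does not say how the authors obtain that bound.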
Social and Information Networks, Physics and Society