Community Detection Algorithm Evaluation using Size and Hashtags

Paul Wagenseller, Feng Wang, Paul Wagenseller III
DOI: https://doi.org/10.48550/arXiv.1612.03362
2016-12-11
Social and Information Networks
Abstract: Understanding community structure in social media is critical because of its broad applications, such as friend recommendation, link prediction, and collaborative filtering. However, there is no widely accepted definition of community in the literature. Existing work uses structure-related metrics such as modularity and function-related metrics such as ground truth to measure the performance of community detection algorithms, while ignoring an important metric: the size of the community. [1] suggests that the size of a community with strong ties in social media should be limited to 150. As we discovered in this paper, the majority of the communities obtained by many popular community detection algorithms are either very small or very large. Communities that are too small have little practical value, and communities that are too large contain weak connections and are therefore not stable. In this paper, we compare various community detection algorithms using the following metrics: size of the communities, coverage of the communities, extended modularity, triangle participation ratio, and user interest in the same community. We also propose a simple clique-based algorithm for community detection as a baseline for the comparison. Experimental results show that both our proposed algorithm and the well-accepted disjoint algorithm InfoMap perform well on all the metrics.
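To make three of the listed metrics concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes an undirected graph given as an edge list, takes coverage to mean the fraction of nodes assigned to at least one community, and uses the usual definition of triangle participation ratio (the fraction of a community's members that belong to at least one triangle lying entirely inside that community). The function names, the toy graph, and the lower size bound `min_size=3` are hypothetical; only the upper bound of 150 comes from the abstract.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_participation_ratio(adj, community):
    """Fraction of members in at least one triangle fully inside the community."""
    members = set(community)
    if not members:
        return 0.0
    in_triangle = 0
    for u in members:
        # Only neighbors that are also community members can close a triangle.
        nbrs = adj.get(u, set()) & members
        if any(w in adj[v] for v, w in combinations(nbrs, 2)):
            in_triangle += 1
    return in_triangle / len(members)


def coverage(adj, communities):
    """Fraction of graph nodes assigned to at least one community (assumed definition)."""
    if not adj:
        return 0.0
    covered = set().union(*communities) & set(adj)
    return len(covered) / len(adj)


def size_summary(communities, min_size=3, max_size=150):
    """Count communities below, within, and above the size range.

    max_size=150 follows the limit cited in the abstract; min_size is a
    hypothetical threshold chosen only for this illustration.
    """
    summary = {"too_small": 0, "in_range": 0, "too_large": 0}
    for c in communities:
        n = len(c)
        if n < min_size:
            summary["too_small"] += 1
        elif n > max_size:
            summary["too_large"] += 1
        else:
            summary["in_range"] += 1
    return summary


# Toy graph: a triangle a-b-c with a tail c-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
communities = [{"a", "b", "c", "d"}, {"e"}]

print(triangle_participation_ratio(adj, communities[0]))
print(coverage(adj, communities))
print(size_summary(communities))
```

In the toy graph, node d sits in the first community but in no triangle, which lowers that community's triangle participation ratio, and the singleton community {e} is counted as too small. Extended modularity and hashtag-based user interest are left out because the abstract does not specify how they are computed.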