Community detection in large‐scale networks: a survey and empirical evaluation

Steve Harenberg, Gonzalo Bello, L. Gjeltema, Stephen Ranshous, Jitendra Harlalka, Ramona Seay, Kanchana Padmanabhan, Nagiza Samatova
DOI: https://doi.org/10.1002/wics.1319
2014-07-22
WIREs Computational Statistics
Abstract: Community detection is a common problem in graph data analytics that consists of finding groups of densely connected nodes with few connections to nodes outside of the group. In particular, identifying communities in large‐scale networks is an important task in many scientific domains. In this review, we evaluated eight state‐of‐the‐art and five traditional algorithms for overlapping and disjoint community detection on large‐scale real‐world networks with known ground‐truth communities. These 13 algorithms were empirically compared using goodness metrics, which measure the structural properties of the identified communities, and performance metrics, which evaluate these communities against the ground truth. Our results show that these two types of metrics are not equivalent: an algorithm may perform well in terms of goodness metrics but poorly in terms of performance metrics, or vice versa.
WIREs Comput Stat 2014, 6:426–439. doi: 10.1002/wics.1319
This article is categorized under: Algorithms and Computational Methods > Algorithms; Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification; Data: Types and Structure > Graph and Network Data
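The distinction the abstract draws between goodness metrics and performance metrics can be illustrated with a small sketch. The example below is not taken from the paper: the choice of conductance as the goodness metric and F1 score as the performance metric, the toy graph, the ground-truth set, and the function names are all assumptions made for illustration.

```python
# Illustrative sketch only: metric choices, graph, and names are assumptions,
# not the paper's actual experimental setup.

def conductance(edges, community):
    """Goodness metric: cut edges divided by the smaller of the two volumes.

    Lower values mean the community is better separated from the rest.
    """
    community = set(community)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in community, v in community
        if u_in != v_in:
            cut += 1                      # edge crosses the community boundary
        vol_in += u_in + v_in             # edge endpoints inside the community
        vol_out += (not u_in) + (not v_in)  # edge endpoints outside it
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def f1_score(found, truth):
    """Performance metric: harmonic mean of precision and recall on node sets."""
    found, truth = set(found), set(truth)
    overlap = len(found & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(found)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

found = {0, 1, 2}      # structurally tight community (hypothetical algorithm output)
truth = {1, 2, 3, 4}   # hypothetical ground-truth community straddling the bridge

found_conductance = conductance(edges, found)
truth_conductance = conductance(edges, truth)
found_f1 = f1_score(found, truth)

print(found_conductance, truth_conductance, found_f1)
```

In this toy case the detected community has a conductance of 1/7, which is structurally good, yet its F1 score against the assumed ground truth is only 4/7. The ground-truth community itself has a conductance of 1.0, which is structurally poor. The two kinds of metric therefore rank the same communities differently.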