Detecting Statistically Significant Communities

Zengyou He, Hao Liang, Zheng Chen, Can Zhao, Yan Liu
DOI: https://doi.org/10.1109/tkde.2020.3015667
IF: 9.235
2020-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Community detection is a key data analysis problem across different fields. Over the past decades, numerous algorithms have been proposed to address it. However, most work on community detection does not address the issue of statistical significance. Although some research efforts have been made towards mining statistically significant communities, deriving an analytical p-value for a single community under the configuration model remains an unsolved challenge. The configuration model is a widely used random graph model in community detection, in which the degree of each node is preserved in the generated random networks. To partially fill this gap, we present a tight upper bound on the p-value of a single community under the configuration model, which can be used to quantify the statistical significance of each community analytically. We also present a local search method that detects statistically significant communities iteratively. Experimental results demonstrate that our method is comparable to competing methods in detecting statistically significant communities.
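To make the null model concrete, the sketch below estimates a community's p-value by simulation rather than analytically. It is not the upper bound derived in the paper. It is a Monte Carlo stand-in that assumes the test statistic is the number of internal edges and that random graphs are drawn by uniform stub matching, with self-loops and parallel edges allowed. All function names and the toy graph are hypothetical, chosen for illustration.

```python
import random


def internal_edges(edges, community):
    """Count edges whose two endpoints both lie inside the community."""
    members = set(community)
    return sum(1 for u, v in edges if u in members and v in members)


def configuration_model_sample(degrees, rng):
    """Draw one random multigraph by pairing degree stubs uniformly at random.

    Assumption: the basic stub-matching formulation, so self-loops and
    parallel edges may appear. Every node keeps its degree exactly.
    """
    stubs = [node for node, d in degrees.items() for _ in range(d)]
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs) - 1, 2)]


def empirical_p_value(edges, community, n_samples=2000, seed=0):
    """Estimate P(internal edges >= observed) under the configuration model."""
    rng = random.Random(seed)
    degrees = {}
    for u, v in edges:
        degrees[u] = degrees.get(u, 0) + 1
        degrees[v] = degrees.get(v, 0) + 1
    observed = internal_edges(edges, community)
    hits = sum(
        internal_edges(configuration_model_sample(degrees, rng), community) >= observed
        for _ in range(n_samples)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_samples + 1)


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
p = empirical_p_value(edges, {0, 1, 2})
print(f"estimated p-value for community {{0, 1, 2}}: {p:.4f}")
```

A simulation like this can only resolve p-values down to roughly 1 / n_samples, which is one reason an analytical bound of the kind the abstract describes is attractive for very significant communities.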