Improving Community Detection in Blockmodel by Distance-Based Observation Selection

Cunqi Shao, Mincheng Wu, Shibo He
DOI: https://doi.org/10.1016/j.physa.2024.130125
2024-01-01
Abstract: Community detection is an important research topic in complex systems and has many applications in real-world networks. Probabilistic methods, such as the Expectation-Maximization (EM) algorithm, have been developed to classify nodes with similar connection patterns in a network based on blockmodels. However, the detection procedures in these models typically start from randomly generated initial community distributions without prior knowledge. In biological and social networks, there are practical ways to obtain prior knowledge for a subset of nodes, such as local observations. This raises the question of how to select a subset of nodes with known community labels to enhance the accuracy of the EM method. Current selection methods do not relate detection accuracy to structural characteristics, and most approaches treat nodes as the centers of communities, which is not suitable for blockmodels. In this paper, we first study the relationship between structural distance and detection accuracy without prior knowledge. Then we propose a distance-based indicator to describe the performance of an observation node set in the EM method. Finally, we introduce a scoring method based on the indicator to select a partial observation set, improving the accuracy of community detection using the EM method. Empirical results from synthetic and real-world networks corroborate that the proposed indicator contributes to better performance in a variety of scenarios.
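To make the role of a partial observation set concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Newman-Leicht style mixture model as the blockmodel (the abstract does not name the exact model), and it simply clamps the responsibilities of observed nodes to their known labels after every E-step. The function name `em_with_observations` and the `observed` argument are hypothetical. How the observed nodes are chosen (the paper's distance-based indicator and scoring method) is not specified in the abstract and is not reproduced here; the observed set is passed in as given.

```python
import numpy as np

def em_with_observations(A, n_groups, observed, n_iter=100, seed=0):
    """EM for an assumed Newman-Leicht style mixture model.

    A        : (n, n) adjacency matrix as a NumPy array
    n_groups : number of communities
    observed : dict mapping node index -> known community label (hypothetical input)
    Returns an (n, n_groups) array of soft community assignments.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    eps = 1e-12  # guards against log(0) and division by zero

    obs_idx = np.array(list(observed.keys()), dtype=int)
    obs_lab = np.array(list(observed.values()), dtype=int)

    def clamp(q):
        # Force observed nodes to their known labels.
        if obs_idx.size:
            q[obs_idx] = 0.0
            q[obs_idx, obs_lab] = 1.0
        return q

    # Random initial community distribution, then apply prior knowledge.
    q = clamp(rng.dirichlet(np.ones(n_groups), size=n))
    deg = A.sum(axis=1)

    for _ in range(n_iter):
        # M-step: group sizes and per-group link-target probabilities.
        pi = q.mean(axis=0)
        theta = (q.T @ A) / (q.T @ deg + eps)[:, None]

        # E-step in log space for numerical stability.
        log_q = np.log(pi + eps)[None, :] + A @ np.log(theta + eps).T
        log_q -= log_q.max(axis=1, keepdims=True)
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
        q = clamp(q)

    return q

# Usage: two 5-node cliques joined by a single edge, one observed node per clique.
A = np.zeros((10, 10))
A[:5, :5] = 1.0
A[5:, 5:] = 1.0
np.fill_diagonal(A, 0.0)
A[4, 5] = A[5, 4] = 1.0

q = em_with_observations(A, n_groups=2, observed={0: 0, 5: 1})
labels = q.argmax(axis=1)
print(labels)
```

Clamping is only one possible way to inject prior knowledge into EM; the sketch is meant to show where an observation set enters the procedure, not which nodes are worth observing.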