ELSNC: A Semi-Supervised Community Detection Method with Integration of Embedding-Enhanced Links and Node Content in Attributed Networks
Jinxin Cao, Xiaoyang Zou, Weizhong Xu, Weiping Ding, Hengrong Ju, Lu Liu, Fuxiang Chen, Di Jin
DOI: https://doi.org/10.1016/j.asoc.2024.112250
2024-01-01
Abstract: In complex network analysis, detecting communities is becoming increasingly important. However, it is difficult to fuse multiple types of information to enhance community-detection performance in real-world applications. Beyond its nodes and edges, a network also carries community structure, topological structure, and network embeddings. Existing work on community detection makes only limited use of all of these information types in combination. In this work, we design a novel unified model called embedding-enhanced link-based semi-supervised community detection with node content (ELSNC). ELSNC integrates the topological structure, the prior information, the network embeddings, and the node content. First, we employ two non-negative matrix factorization (NMF)-based stochastic models to characterize the node-community membership and the content-community membership (by performing similarity detection between a topic model and NMF). Second, we introduce the topological similarity of the nodes and of the network embeddings into the model as topological information. To model the topological similarity, we introduce a strong constraint (i.e., the prior information) and apply matrix completion to identify the community membership using the representation ability of the network embeddings. Finally, we present a semi-supervised community-detection method based on NMF that combines the network topology, the content information, and the network embeddings. The innovation of our work can be captured in two points: 1) As a semi-supervised community detection method, it extends the theory of semi-supervised methods on attributed networks and proposes a unified model that integrates multiple information types.
2) The community membership obtained by the unified model simultaneously captures topological, content, prior, and embedding information, which allows community structure to be explored more robustly in real-world scenarios. Furthermore, we performed a comprehensive evaluation of our proposed approach against state-of-the-art methods on both synthetic and real-world networks. The results show that our proposed method significantly outperformed the baseline methods.
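
To make the general idea of NMF-based semi-supervised community detection concrete, here is a minimal sketch. It is not the ELSNC model: the abstract does not state the objective function, so the sketch swaps in a much simpler, generic technique, namely symmetric NMF on a single fused similarity matrix, and it leaves out network embeddings and matrix completion entirely. The function names (`fuse_similarity`, `symmetric_nmf`), the weights `alpha` and `beta`, the `seeds` dictionary of labelled nodes, and the toy network are all assumptions made for illustration.

```python
import numpy as np


def fuse_similarity(A, X, seeds, alpha=1.0, beta=1.0):
    """Fuse topology, content and prior labels into one similarity matrix.

    A     : (n, n) symmetric, non-negative adjacency matrix
    X     : (n, d) non-negative node content (e.g. bag-of-words counts)
    seeds : {node index: community index} for the labelled nodes
    alpha, beta : hypothetical weights for content and prior information
    """
    # Cosine similarity between content vectors; non-negative because X is.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    C = Xn @ Xn.T
    # Must-link matrix: 1 for each pair of labelled nodes sharing a community.
    M = np.zeros_like(A, dtype=float)
    for i, ci in seeds.items():
        for j, cj in seeds.items():
            if i != j and ci == cj:
                M[i, j] = 1.0
    return A + alpha * C + beta * M


def symmetric_nmf(S, k, seeds, n_iter=200, eps=1e-9):
    """Approximate S by H @ H.T with H >= 0 using damped multiplicative updates."""
    n = S.shape[0]
    # Small positive start, nudged towards the known community of each seed.
    H = np.full((n, k), 0.1)
    for node, comm in seeds.items():
        H[node, comm] += 1.0
    for _ in range(n_iter):
        numer = S @ H
        denom = H @ (H.T @ H) + eps
        H *= 0.5 + 0.5 * numer / denom  # keeps every entry non-negative
    return H


# Toy attributed network: two 4-node cliques joined by the single edge (3, 4).
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

# One content feature per clique; three nodes carry a known label.
X = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4)
seeds = {0: 0, 1: 0, 7: 1}

S = fuse_similarity(A, X, seeds)
H = symmetric_nmf(S, k=2, seeds=seeds)
labels = H.argmax(axis=1)  # hard assignment: strongest membership per node
print(labels)
```

Each row of `H` can be read as a soft community membership vector for one node, and `argmax` turns it into a hard assignment. Fusing everything into one matrix is only the simplest way to combine information sources; a unified model of the kind described in the abstract would instead couple separate factorization terms for topology, content, and embeddings.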