Influential Community Search over Large Heterogeneous Information Networks.
Yingli Zhou, Yixiang Fang, Wensheng Luo, Yunming Ye
DOI: https://doi.org/10.14778/3594512.3594532
IF: 2.5
2023-01-01
Proceedings of the VLDB Endowment
Abstract: Influential community search has recently attracted much attention. Given a graph, it aims to find communities of vertices with high importance values. Existing works mainly focus on conventional homogeneous networks, where all vertices are of the same type. They therefore cannot be applied to heterogeneous information networks (HINs), such as bibliographic networks and knowledge graphs, where vertices are of multiple types and their importance values are heterogeneous (i.e., importance means different things for vertices of different types). In this paper, we study the problem of influential community search over large HINs. We introduce a novel community model, called the heterogeneous influential community (HIC): a set of closely connected vertices that are of the same type and have high importance values, defined using the meta-path-based core model. An HIC not only captures the importance of the vertices in a community, but also considers the influence on the meta-paths connecting them. To search for HICs, we mainly consider meta-paths with two and three vertex types. We develop basic algorithms that iteratively peel vertices with low importance values, and further propose advanced algorithms that identify key vertices and use pruning strategies to quickly eliminate vertices with low importance values. Extensive experiments on four large real-world HINs show that our solutions are effective for searching HICs, and that the advanced algorithms significantly outperform the baselines.
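The peeling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a toy bibliographic HIN given as author-paper edges, the meta-path Author-Paper-Author, and a simplified core condition in which every community member must have at least k co-author neighbors along that meta-path. It ignores the influence on meta-paths, the key-vertex identification, and the pruning strategies. The names `apa_neighbors` and `peel_influential_core` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def apa_neighbors(author_paper_edges):
    """Link two authors if they share a paper (meta-path Author-Paper-Author)."""
    authors_of = defaultdict(set)
    for author, paper in author_paper_edges:
        authors_of[paper].add(author)
    nbrs = defaultdict(set)
    for group in authors_of.values():
        for a in group:
            nbrs[a]  # register authors that have no co-author
        for a, b in combinations(group, 2):
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs


def peel_influential_core(nbrs, weight, k):
    """Return the k-core whose minimum importance is largest (empty set if none).

    Assumption: a simplified model where a community is a set of vertices
    that each keep at least k meta-path neighbors inside the set.
    """
    alive = set(nbrs)
    deg = {v: len(nbrs[v]) for v in alive}

    def remove(v):
        # Delete v, then cascade to neighbors whose degree drops below k.
        alive.discard(v)
        stack = [v]
        while stack:
            x = stack.pop()
            for y in nbrs[x]:
                if y in alive:
                    deg[y] -= 1
                    if deg[y] < k:
                        alive.discard(y)
                        stack.append(y)

    # Step 1: shrink the graph to its k-core.
    for v in [v for v in alive if deg[v] < k]:
        if v in alive:
            remove(v)

    # Step 2: repeatedly peel the least important vertex; the last
    # non-empty snapshot has the highest minimum importance.
    best = set()
    while alive:
        best = set(alive)  # copying is quadratic overall; fine for a sketch
        remove(min(alive, key=lambda v: weight[v]))
    return best
```

For example, with authors a, b, c sharing paper p1 and c, d, e sharing paper p2, importance values a=1, b=2, c=5, d=6, e=7, and k=2, peeling a also drops b, leaving c, d, e as the community with the highest minimum importance.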