C&C: An Effective Algorithm For Extracting Web Community Cores

Xianchao Zhang, Yueting Li, Wenxin Liang
DOI: https://doi.org/10.1007/978-3-642-14589-6_32
2010-01-01
Abstract: Communities are a significant pattern of the Web. A community is a group of pages related to a common topic. Web communities can be characterized by dense bipartite subgraphs, and each community almost surely contains at least one core, where a core is a complete bipartite graph (CBG). Focusing on the problem of extracting such community cores from the Web, this paper proposes C&C, an algorithm based on combination and consolidation that extracts all embedded cores in web graphs. Experiments on large real-world data collections demonstrate that C&C is efficient and effective for community core extraction because: 1) all the largest emerging cores can be identified; 2) identifying all the embedded cores of different sizes requires only a single pass of C&C; 3) the extraction process needs no user-determined parameters.
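To make the notion of a core concrete, the following is a minimal Python sketch of the definition only, not of the C&C algorithm itself (the abstract does not describe its steps). It assumes the web graph is given as a mapping from each page to the set of pages it links to; the function name `is_core`, the terms "fans" and "centers" for the two sides of the bipartite graph, and the toy data are all hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, Set


def is_core(links: Dict[str, Set[str]],
            fans: Iterable[str],
            centers: Iterable[str]) -> bool:
    """Return True if every fan links to every center.

    Such a pair (fans, centers) forms a complete bipartite graph (CBG),
    which is what the abstract calls a community core.
    """
    center_set = set(centers)
    # A fan missing from `links` is treated as having no outgoing links.
    return all(center_set <= links.get(fan, set()) for fan in fans)


# Hypothetical toy web graph: page -> set of pages it links to.
links = {
    "fan1": {"c1", "c2", "c3"},
    "fan2": {"c1", "c2"},
    "fan3": {"c2"},
}

# fan1 and fan2 both link to c1 and c2, so they form a 2x2 core.
print(is_core(links, ["fan1", "fan2"], ["c1", "c2"]))

# fan3 does not link to c1, so adding it breaks completeness.
print(is_core(links, ["fan1", "fan2", "fan3"], ["c1", "c2"]))
```

The check only verifies a candidate pair of page sets; enumerating all such cores in a large graph is the harder problem the paper addresses.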