I/O-efficient Calculation of H-group Closeness Centrality over Disk-Resident Graphs.

Junzhou Zhao,Pinghui Wang,John C. S. Lui,Don Towsley,Xiaohong Guan
DOI: https://doi.org/10.1016/j.ins.2017.03.017
IF: 8.1
2017-01-01
Information Sciences
Abstract:We introduce H-group closeness centrality in this work. H-group closeness centrality of a group of nodes measures how close this node group is to other nodes in a graph, and can be used in numerous applications such as measuring the importance and influence of a group of users in a social network. When a large graph contains billions of edges that cannot reside entirely in a computers main memory, computing and maximizing H-group closeness centrality both become challenging. In this work, we present a systematic solution for efficiently computing and maximizing H-group closeness centrality in large disk-resident graphs, only using a common PC. Our solution leverages a probabilistic counting method to efficiently estimate H-group closeness with high accuracy, rather than exhaustively computing it in an exact fashion. Furthermore, we design an I/O-efficient greedy algorithm to find a node group that approximately maximizes H-group closeness centrality in a graph. Our algorithm exploits several appealing properties of the defined H-group closeness measure to reduce the computational cost of handling a disk-resident graph. Extensive evaluation on large real-world and synthetic graphs demonstrates the efficiency of our proposed method. For example, our proposed I/O-efficient greedy algorithm is about 300 times faster than a simple multi-pass method on the Twitter graph with 1.4 billion edges. This reduces the running time of identifying one group member from nearly an hour to less than 20s on average.
What problem does this paper attempt to address?