Assessing and ranking structural correlations in graphs.

Ziyu Guan,Jian Wu,Qing Zhang,Ambuj K. Singh,Xifeng Yan
DOI: https://doi.org/10.1145/1989323.1989421
2011-01-01
Abstract:Real-life graphs not only have nodes and edges, but also have events taking place, e.g., product sales in social networks and virus infection in communication networks. Among different events, some exhibit strong correlation with the network structure, while others do not. Such structural correlation will shed light on viral influence existing in the corresponding network. Unfortunately, the traditional association mining concept is not applicable in graphs since it only works on homogeneous datasets like transactions and baskets. We propose a novel measure for assessing such structural correlations in heterogeneous graph datasets with events. The measure applies hitting time to aggregate the proximity among nodes that have the same event. In order to calculate the correlation scores for many events in a large network, we develop a scalable framework, called gScore, using sampling and approximation. By comparing to the situation where events are randomly distributed in the same network, our method is able to discover events that are highly correlated with the graph structure. gScore is scalable and was successfully applied to the co-author DBLP network and social networks extracted from TaoBao.com, the largest online shopping network in China, with many interesting discoveries.
What problem does this paper attempt to address?