Giceberg: Towards Iceberg Analysis in Large Graphs

Nan Li,Ziyu Guan,Lijie Ren,Jian Wu,Jiawei Han,Xifeng Yan
DOI: https://doi.org/10.1109/icde.2013.6544894
2013-01-01
Abstract:Traditional multi-dimensional data analysis techniques such as iceberg cube cannot be directly applied to graphs for finding interesting or anomalous vertices due to the lack of dimensionality in graphs. In this paper, we introduce the concept of graph icebergs that refer to vertices for which the concentration (aggregation) of an attribute in their vicinities is abnormally high. Intuitively, these vertices shall be “close” to the attribute of interest in the graph space. Based on this intuition, we propose a novel framework, called gIceberg, which performs aggregation using random walks, rather than traditional SUM and AVG aggregate functions. This proposed framework scores vertices by their different levels of interestingness and finds important vertices that meet a user-specified threshold. To improve scalability, two aggregation strategies, forward and backward aggregation, are proposed with corresponding optimization techniques and bounds. Experiments on both real-world and synthetic large graphs demonstrate that gIceberg is effective and scalable.
What problem does this paper attempt to address?