SGES: A General and Space-efficient Framework for Graphlet Counting in Graph Streams

Chen Yang, Lailong Luo, Yuliang Lu, Chu Huang, Qianzhen Zhang, Guozheng Yang, Deke Guo
DOI: https://doi.org/10.1145/3627673.3679739
2024-01-01
Abstract: Graphlets are small, connected, non-isomorphic induced subgraphs that describe the topological structure of a graph. Counting graphlets is a fundamental task in graph mining and social network analysis, with applications that include dense subgraph discovery and anomaly detection. Most existing work assumes a static graph. Real-world graphs, however, are dynamic and can be modeled as graph streams. Counting graphlets in graph streams is challenging because of the streaming nature of the input. Although several studies have addressed graphlet counting in graph streams, they are limited to simple graphlets such as triangles and butterflies. In this paper, we propose the SGES algorithm to estimate the counts of more complex graphlets in graph streams. SGES first uses an unbiased sampling strategy to maintain a fixed-size sample of edges. This sample yields unbiased estimates of the number of subgraphs, from which graphlet counts are derived through the combinatorial relationship between subgraph counts and graphlet counts. Extensive experiments on large real-world graph streams show that the algorithm obtains accurate estimates of graphlet counts with high throughput.
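The pipeline outlined in the abstract (fixed-size edge sample, unbiased subgraph estimates, conversion to graphlet counts) can be illustrated with a minimal sketch. This is not the SGES algorithm: standard reservoir sampling stands in for the paper's sampling strategy, only 3-node graphlets are covered, and all function names are hypothetical. It assumes a simple undirected stream with no duplicate edges or self-loops. The conversion step uses the well-known identity that the number of induced (open) wedges equals the number of non-induced wedges minus three times the number of triangles.

```python
import random
from collections import defaultdict


def reservoir_sample(edges, m, rng):
    """Standard reservoir sampling (an assumption, not the paper's strategy).

    Keeps a uniform sample of at most m edges; returns (sample, stream_length).
    """
    sample, t = [], 0
    for edge in edges:
        t += 1
        if len(sample) < m:
            sample.append(edge)
        else:
            j = rng.randrange(t)  # uniform in [0, t)
            if j < m:
                sample[j] = edge
    return sample, t


def count_subgraphs(sample):
    """Count non-induced wedges (2-paths) and triangles among sampled edges."""
    adj = defaultdict(set)
    for u, v in sample:
        adj[u].add(v)
        adj[v].add(u)
    wedges = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    edge_set = {(min(u, v), max(u, v)) for u, v in sample}
    # Each triangle is found once per edge, hence the division by 3.
    triangles = sum(len(adj[u] & adj[v]) for u, v in edge_set) // 3
    return wedges, triangles


def inclusion_prob(k, m, t):
    """Probability that k fixed edges are all in a uniform m-of-t sample."""
    if t <= m:
        return 1.0
    p = 1.0
    for i in range(k):
        p *= (m - i) / (t - i)
    return p


def estimate_graphlets(edges, m, seed=0):
    """Estimate 3-node graphlet counts from a fixed-size edge sample (m >= 3)."""
    sample, t = reservoir_sample(edges, m, random.Random(seed))
    wedges, triangles = count_subgraphs(sample)
    # Scale by inverse inclusion probabilities for unbiased subgraph estimates.
    wedge_hat = wedges / inclusion_prob(2, m, t)
    triangle_hat = triangles / inclusion_prob(3, m, t)
    # Combinatorial conversion: induced wedges = wedges - 3 * triangles.
    return {"wedge": wedge_hat - 3 * triangle_hat, "triangle": triangle_hat}
```

When the sample budget `m` is at least the stream length, every edge is kept and the estimates coincide with the exact counts, which is a convenient sanity check. For smaller budgets the estimates are unbiased in expectation but individual runs vary, and the induced wedge estimate can even be negative.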