T-FSM: A Task-Based System for Massively Parallel Frequent Subgraph Pattern Mining from a Big Graph.

Lyuheng Yuan, Da Yan, Wenwen Qu, Saugat Adhikari, Jalal Khalil, Cheng Long, Xiaoling Wang
DOI: https://doi.org/10.1145/3588928
2023-01-01
Abstract: Finding frequent subgraph patterns in a big graph is an important problem with many applications such as classifying chemical compounds and building indexes to speed up graph queries. Since this problem is NP-hard, some recent parallel systems have been developed to accelerate the mining. However, they often have a huge memory cost, very long running time, suboptimal load balancing, and possibly inaccurate results. In this paper, we propose an efficient system called T-FSM for parallel mining of frequent subgraph patterns in a big graph. T-FSM adopts a novel task-based execution engine design to ensure high concurrency, bounded memory consumption, and effective load balancing. It also supports a new anti-monotonic frequentness measure called Fraction-Score, which is more accurate than the widely used MNI measure. Our experiments show that T-FSM is orders of magnitude faster than SOTA systems for frequent subgraph pattern mining. Our system code has been released at https://github.com/lyuheng/T-FSM.
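For readers unfamiliar with the MNI measure mentioned in the abstract, the following is a minimal sketch of the standard minimum image-based support definition: for each pattern vertex, count the distinct data-graph vertices it is mapped to across all embeddings, then take the minimum of those counts. This is not code from T-FSM; the function name `mni_support` and the dict-based embedding representation are assumptions made for illustration, and Fraction-Score is not shown because the abstract does not define it.

```python
from collections import defaultdict


def mni_support(embeddings):
    """Compute MNI support from a list of embeddings.

    Each embedding is a dict mapping a pattern vertex to the data-graph
    vertex it is matched to (a hypothetical representation for this sketch).
    """
    images = defaultdict(set)  # pattern vertex -> set of distinct images
    for emb in embeddings:
        for pattern_vertex, graph_vertex in emb.items():
            images[pattern_vertex].add(graph_vertex)
    if not images:
        return 0  # no embeddings means the pattern has zero support
    # MNI is the size of the smallest image set over all pattern vertices.
    return min(len(vertex_set) for vertex_set in images.values())


# Pattern: a single edge a-b. Data graph: a star with center "c" and
# leaves 1, 2, 3, where only "c" can match pattern vertex "a".
star_embeddings = [
    {"a": "c", "b": 1},
    {"a": "c", "b": 2},
    {"a": "c", "b": 3},
]
print(mni_support(star_embeddings))
```

In this star example there are three embeddings, but pattern vertex `a` always maps to the same center vertex, so the MNI support is limited by that single image.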