LessMine: Reducing Sample Space and Data Access for Dense Pattern Mining

Tianyu Fu, Ziqian Wan, Guohao Dai, Yu Wang, Huazhong Yang
DOI: https://doi.org/10.1109/hpec43674.2020.9286187
2020-01-01
Abstract: In the era of “big data”, graphs have proven to be one of the most important representations of real-world problems. Dense pattern mining plays a significant role in extracting the core properties of large-scale graphs. Because pattern mining problems are complex, conventional implementations often scale poorly, consuming much time and memory. Previous work (e.g., ASAP [1]) proposed approximate pattern mining as an efficient way to extract structural information from graphs, demonstrating performance improvements of up to two orders of magnitude. However, we observe three main flaws in ASAP when mining dense patterns. We therefore propose LessMine, which reduces the sample space and data access for dense pattern mining. We introduce a reorganized data structure, the concurrent sample method, and uniform close. We also provide locality-aware partitioning for distributed settings. The evaluation shows that our design achieves up to 1829× speedup with a 66% lower error rate compared with ASAP.