A Label-Based Partitioning Strategy for Mining Link Patterns

Cuifang Zhao, Xiang Zhang, Peng Wang
DOI: https://doi.org/10.1109/kicss.2012.15
2012-01-01
Abstract: With the explosive growth of online linked data, the task of mining link patterns is attracting increasing attention. A practical issue is how to mine efficiently in large-scale linked data. Existing pattern mining algorithms usually assume that the dataset fits into main memory, whereas linked data with billions of triples far exceeds that limit. In this paper we present a pilot study of a novel partitioning strategy for mining link patterns in large-scale linked data. First, we propose an algorithm named ParGroup that divides large linked data into partitions and groups them based on vertex labels. Second, an adapted gSpan is applied to mine link patterns in each partition. Finally, the discovered link patterns are merged into a global result set. Experiments show that our strategy is feasible and promising in some scenarios.
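The partition, mine, and merge steps described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's ParGroup or its adapted gSpan: all function names are hypothetical, triples are partitioned by the label of the subject vertex alone, and the per-partition miner is a stand-in that counts single-edge patterns rather than mining general subgraphs.

```python
from collections import Counter, defaultdict

UNKNOWN = "UNKNOWN"  # assumed fallback label for vertices without a label


def partition_by_label(triples, labels):
    """Group triples by the label of their subject vertex (hypothetical helper)."""
    partitions = defaultdict(list)
    for s, p, o in triples:
        partitions[labels.get(s, UNKNOWN)].append((s, p, o))
    return dict(partitions)


def mine_partition(triples, labels):
    """Stand-in miner: count (subject label, predicate, object label) patterns.

    A real implementation would run a frequent-subgraph miner such as gSpan here.
    """
    counts = Counter()
    for s, p, o in triples:
        counts[(labels.get(s, UNKNOWN), p, labels.get(o, UNKNOWN))] += 1
    return counts


def merge_patterns(per_partition_counts, min_support):
    """Sum pattern supports across partitions and keep the frequent ones."""
    total = Counter()
    for counts in per_partition_counts:
        total.update(counts)
    return {pattern: n for pattern, n in total.items() if n >= min_support}


def mine_link_patterns(triples, labels, min_support=2):
    """Partition, mine each partition independently, then merge the results."""
    partitions = partition_by_label(triples, labels)
    results = [mine_partition(part, labels) for part in partitions.values()]
    return merge_patterns(results, min_support)


if __name__ == "__main__":
    labels = {"a1": "Person", "a2": "Person", "a3": "Person",
              "p1": "Paper", "p2": "Paper"}
    triples = [("a1", "knows", "a2"), ("a2", "knows", "a3"),
               ("a1", "wrote", "p1"), ("p1", "cites", "p2")]
    print(mine_link_patterns(triples, labels, min_support=2))
```

Because each triple lands in exactly one partition in this sketch, summing supports during the merge does not double-count. Only one partition needs to be held in memory at a time, which is the motivation the abstract gives for partitioning.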