An Efficient Parallel High Utility Sequential Pattern Mining Algorithm

Chunkai Zhang, Yiwen Zu, Junli Nie, Linzi Du
DOI: https://doi.org/10.1109/hpcc/smartcity/dss.2019.00392
2019-01-01
Abstract: High utility sequential pattern mining (HUSPM) is an emerging research topic in pattern mining, and many efficient algorithms have been proposed for it in recent years. However, most of them are serial algorithms that assume the entire database (and the data structures) fits completely into memory. This leads to two conspicuous drawbacks. First, in the big data era, it is hard to fit a large-scale database entirely into memory. Second, serial algorithms cannot process real-time data quickly. To address these shortcomings, we propose an efficient distributed algorithm for HUSPM. The proposed algorithm takes advantage of multiple machines and multi-core processors, so it can complete the mining task quickly. The major contributions are summarized as follows. First, we propose an algorithm based on the SLST strategy to partition the original database. Second, we propose a multi-threaded algorithm for HUSPM. Finally, building on these two components, we propose an efficient distributed algorithm named PHUSP. Our experiments show that the proposed algorithm is much faster than state-of-the-art algorithms.