Parallel Sequential Pattern Mining by Transaction Decomposition

Xueqiang Wang, Jing Wang, Tengjiao Wang, Hongyan Li, Dongqing Yang
DOI: https://doi.org/10.1109/fskd.2010.5569404
2010-01-01
Abstract: Sequential pattern mining is an important and useful tool with broad applications, such as analyzing customer purchase behavior and recommending services to customers. It is challenging because an explosive number of subsequences must be examined, and both memory and computational costs become extremely high as the sequence database grows. Many previous algorithms developed for efficient mining of sequential patterns have difficulty handling large-scale data. In this paper, we propose a parallel sequential pattern mining method called PTDS (Parallel Transaction-Decomposed Sequential pattern mining), which decomposes transactions to mine sequential patterns. PTDS greatly accelerates pattern growth and improves the efficiency of the parallel algorithm on large-scale data. We experiment on a large dataset consisting of 16 million service purchase sequences. Beyond demonstrating scalability, the empirical comparisons show that PTDS consistently outperforms both the PrefixSpan-based parallel method and the serial algorithm.