Multi-queries oriented closed sequential pattern mining over stream

Haifeng Li, Ning Zhang, Mo Hai, Yanmei Chai
2010-01-01
Journal of Information and Computational Science
Abstract: Sequential pattern mining is an important problem in data mining. Closed sequential patterns retain the complete information of the full set of sequential patterns while using less memory, which makes them better suited to stream mining. This paper considers how to mine closed sequential patterns at lower cost when multiple queries exist. We propose an incremental closed sequential pattern mining method named SCSPM, which runs over a stream sliding window and maintains the complete set of closed sequential patterns, so that multiple queries with different minimum supports can be answered more easily; moreover, we present a backward update strategy based on the closed sequence property. During mining, only closed sequential patterns are maintained, which reduces memory cost. In addition, we introduce a two-level hash index to quickly locate existing closed sequential patterns, which speeds up data generation. Finally, we propose a novel sequence partition method that improves computational efficiency with little loss of accuracy, addressing the high cost of generating subsequences from long sequences. Our experimental results on synthetic data show that SCSPM is effective and efficient. Copyright © 2010 Binary Information Press.
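The claim that closed sequential patterns keep the full support information, and can therefore serve several queries with different minimum supports, can be illustrated with a small sketch. This is not SCSPM: it is a brute-force enumeration over one tiny window, assuming sequences of single items, and every function name here (`is_subsequence`, `all_patterns`, `closed_patterns`, `query`, `recover_support`) is hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_subsequence(short, long):
    """Return True if `short` occurs in `long` in order (gaps allowed)."""
    it = iter(long)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in short)


def support(pattern, window):
    """Count the sequences in the window that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in window)


def all_patterns(window):
    """Brute force: every subsequence occurring in the window, with its support."""
    candidates = set()
    for seq in window:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))
    return {p: support(p, window) for p in candidates}


def closed_patterns(patterns):
    """Keep a pattern only if no proper super-sequence has the same support."""
    return {
        p: s
        for p, s in patterns.items()
        if not any(
            q != p and t == s and is_subsequence(p, q)
            for q, t in patterns.items()
        )
    }


def query(closed, min_support):
    """Answer one query: closed patterns meeting the given minimum support."""
    return {p: s for p, s in closed.items() if s >= min_support}


def recover_support(pattern, closed):
    """Support of any pattern is the largest support among its closed super-sequences."""
    return max(
        (s for q, s in closed.items() if is_subsequence(pattern, q)),
        default=0,
    )


# A toy sliding window of four sequences (illustrative data, not from the paper).
window = [("a", "b", "c"), ("a", "c"), ("a", "b", "c"), ("b", "c")]
closed = closed_patterns(all_patterns(window))

# The closed set is computed once and reused for queries with different thresholds.
print(query(closed, 3))
print(query(closed, 4))
print(recover_support(("a",), closed))
```

In this toy window seven patterns occur but only four are closed, and the support of a non-closed pattern such as `("a",)` is still recoverable from the closed set. The sketch recomputes everything from scratch; the incremental maintenance, backward update, hash index, and sequence partitioning described in the abstract are not modeled here.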