Mining sequential patterns with wildcards and the One-Off condition

WU Xin-Dong, Fei Xie, HUANG Yong-Ming, HU Xue-Gang, Jun Gao
DOI: https://doi.org/10.3724/SP.J.1001.2013.04422
2013-01-01
Ruan Jian Xue Bao/Journal of Software
Abstract: Sequence data are abundant in real-world applications, and sequential pattern mining aims to extract important patterns from them. Given a sequence S, a support threshold, and gap constraints, this paper aims to discover the frequent patterns, i.e., those whose support in S is no less than the threshold. A pattern P may contain flexible wildcards, and the number of wildcards between any two successive elements of P must satisfy the user-specified gap constraints. The paper designs an efficient mining algorithm, One-Off Mining, whose mining process satisfies the One-Off condition: each character in the given sequence can be used at most once across all occurrences of a pattern. Experiments on DNA sequences show that the method outperforms related sequential pattern mining algorithms in both running time and completeness. © 2013 ISCAS.
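To make the gap constraints and the One-Off condition concrete, the sketch below counts occurrences of a single pattern in a sequence. It is a simple leftmost-first greedy heuristic written for illustration only, not the One-Off Mining algorithm from the paper, so the count it returns is a lower bound on the best achievable One-Off support. The function names `find_occurrence` and `one_off_occurrences` and the parameters `min_gap` and `max_gap` are assumptions introduced here; the gap between two successive pattern elements is taken to be the number of sequence characters skipped between them.

```python
def find_occurrence(seq, pattern, min_gap, max_gap, used, start):
    """Search for one occurrence of `pattern` whose first element sits at `start`.

    Returns the list of matched positions, or None if no occurrence exists
    that respects the gap constraints and avoids positions in `used`.
    """
    def extend(pos, k, chosen):
        # The character must match and must not already belong to an
        # earlier occurrence (One-Off condition).
        if seq[pos] != pattern[k] or pos in used:
            return None
        chosen = chosen + [pos]
        if k == len(pattern) - 1:
            return chosen
        # The next element must lie min_gap..max_gap wildcards further on.
        lo = pos + 1 + min_gap
        hi = min(pos + 1 + max_gap, len(seq) - 1)
        for nxt in range(lo, hi + 1):
            result = extend(nxt, k + 1, chosen)
            if result is not None:
                return result
        return None

    return extend(start, 0, [])


def one_off_occurrences(seq, pattern, min_gap, max_gap):
    """Greedily collect occurrences so that no position is used twice."""
    if not pattern:
        return []
    used = set()
    occurrences = []
    for start in range(len(seq)):
        occ = find_occurrence(seq, pattern, min_gap, max_gap, used, start)
        if occ is not None:
            used.update(occ)  # these positions are now consumed
            occurrences.append(occ)
    return occurrences


if __name__ == "__main__":
    # Pattern "AT" with 0 to 2 wildcards between A and T.
    print(one_off_occurrences("AATT", "AT", 0, 2))  # [[0, 2], [1, 3]]
```

In the sequence AATT, the pattern AT with a gap of 0 to 2 has four occurrences if positions may be reused, namely (0,2), (0,3), (1,2), and (1,3), but only two under the One-Off condition. A pattern would then be reported as frequent when this count is no less than the chosen threshold.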