Online Mining of Maximal Frequent Itemsequences from Data Streams

Guojun Mao, Xindong Wu, Chunnian Liu, Xingquan Zhu, Gong Chen, Yue Sun, Xu Liu
2005-01-01
Abstract: Mining data streams often requires real-time extraction of interesting patterns from dynamic, continuously growing data. This requirement makes it challenging to discover and output currently useful patterns instantly, a task commonly referred to as online streaming data mining. In this paper, we present INSTANT, a novel algorithm that mines maximal frequent itemsequences from streaming data in an online fashion. We first provide useful operators on the lattice of itemsequential sets, and then apply them to the algorithm design of INSTANT. In comparison with the most popular methods, such as closed-itemset-based mining algorithms, INSTANT has solid theoretical foundations to ensure that it employs more compact in-memory data structures than closed itemsequences. Experimental results show that our method can achieve better results than previous related methods in terms of both time and space efficiency. This paper aims at online mining of frequent itemsequences from data streams. We will present an efficient algorithm called INSTANT (maxImal frequeNt So-far iTemsequence mAiNTainer), which is based on a new mining theory provided by this paper. The paper will also discuss the performance of the proposed algorithm from both theoretical and experimental views.