A Mining Maximal Frequent Itemsets over the Entire History of Data Streams

Yinmin Mao, Hong Li, Lumin Yang, Zhigang Chen, Lixin Liu
DOI: https://doi.org/10.1109/dbta.2009.125
2009-01-01
Abstract: Mining maximal frequent itemsets has received wide attention. However, mining data streams is more difficult than mining static databases because streaming data is huge, high-speed, and continuous. This paper presents an algorithm called IDSM-MFI. The algorithm uses a synopsis data structure to store the items of the transactions embedded in the data stream so far. It combines top-down and bottom-up search to mine the set of all maximal frequent itemsets in landmark windows over the data stream, and the result can be output in real time based on user-specified thresholds. Theoretical analysis and experimental results show that the algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of a data stream.
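
To make the terms in the abstract concrete, the following is a minimal sketch of what "maximal frequent itemsets in a landmark window" means. It is not IDSM-MFI: the abstract does not specify the synopsis structure or the search procedure, so this sketch assumes a plain counter of distinct transactions and a brute-force, largest-first enumeration. The class name `LandmarkWindow` and its methods are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


class LandmarkWindow:
    """Hypothetical helper: keeps every transaction seen since the landmark."""

    def __init__(self):
        # Assumed stand-in for a synopsis: count of each distinct transaction.
        self.counts = Counter()
        self.n = 0  # number of transactions seen so far

    def add(self, transaction):
        """Record one transaction (any iterable of hashable items)."""
        self.counts[frozenset(transaction)] += 1
        self.n += 1

    def support(self, itemset):
        """Number of stored transactions that contain every item of itemset."""
        return sum(c for t, c in self.counts.items() if itemset <= t)

    def maximal_frequent_itemsets(self, min_support):
        """Return itemsets with support >= min_support * n that have no
        frequent proper superset. min_support is a fraction in (0, 1]."""
        threshold = min_support * self.n
        items = sorted({i for t in self.counts for i in t})
        maximal = []
        # Enumerate candidates from largest to smallest. A frequent candidate
        # that is not contained in an already-found maximal itemset cannot
        # have a frequent superset, so it is itself maximal.
        for size in range(len(items), 0, -1):
            for cand in combinations(items, size):
                s = frozenset(cand)
                if any(s <= m for m in maximal):
                    continue  # covered by a known maximal itemset
                if self.support(s) >= threshold:
                    maximal.append(s)
        return maximal


window = LandmarkWindow()
for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"], ["a", "b", "c"]):
    window.add(t)

# The threshold is supplied at query time, so it can change between queries.
print(window.maximal_frequent_itemsets(0.3))
print(window.maximal_frequent_itemsets(0.5))
```

The enumeration is exponential in the number of distinct items, so this only serves to illustrate the definitions on small inputs. A stream algorithm would replace both the per-transaction counter and the exhaustive search with a compact summary and pruned traversal.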