An efficient algorithm for mining high utility itemsets from data streams based on sliding window techniques

Shiming GUO, Hong GAO
DOI: https://doi.org/10.11990/jheu.201611075
2018-01-01
Abstract: Existing algorithms for high utility itemset mining (HUIM) over a sliding window have two problems: the number of candidates is usually very large and requires extensive memory, and candidate verification is time-consuming. This paper therefore proposes HUISW (high utility itemset mining over a sliding window), an efficient algorithm that mines high utility itemsets from a data stream without generating candidates. HUISW adopts a novel tree structure, HUIL-Tree (a high utility itemset tree that arranges items in lexicographic order), to store information on the itemsets in a sliding window, and a utility database to store the utility information on the itemsets in the transactions of a window. During the mining process, the pattern-growth method is used to generate itemsets from the HUIL-Tree. For each itemset generated, its utility in the window is calculated directly using the correspondence between the itemset and the utility database, so the whole process generates no candidates. Extensive experiments on both sparse and dense stream datasets compare HUISW with the state-of-the-art algorithm SHU-Growth (sliding window based high utility growth). The experimental results show that HUISW significantly outperforms SHU-Growth, running two orders of magnitude faster.
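
To make the quantity being mined concrete, the following minimal sketch computes the utility of one itemset over a sliding window by brute force. It is not the HUISW algorithm and does not build an HUIL-Tree or a utility database. It assumes the common HUIM utility model (purchased quantity times unit profit), which the abstract does not spell out, and the names `itemset_utility`, `unit_profit`, and `stream` are hypothetical.

```python
from collections import deque


def itemset_utility(itemset, window, unit_profit):
    """Sum the utility of `itemset` over the window's transactions that contain it.

    Assumed model: utility of an item in a transaction = quantity * unit profit.
    """
    items = set(itemset)
    total = 0
    for tx in window:  # each transaction maps item -> purchased quantity
        if items.issubset(tx):  # only transactions containing the whole itemset count
            total += sum(tx[i] * unit_profit[i] for i in items)
    return total


# Hypothetical unit profits and transaction stream, for illustration only.
unit_profit = {"a": 2, "b": 3, "c": 1}
stream = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"a": 1, "b": 1, "c": 1},
    {"b": 4},
]

# A window of the 3 most recent transactions; deque drops the oldest automatically.
window = deque(maxlen=3)
utilities = []
for tx in stream:
    window.append(tx)
    utilities.append(itemset_utility({"a", "b"}, window, unit_profit))

print(utilities)
```

An itemset is a high utility itemset in a window when this value meets a user-chosen minimum utility threshold. Rescanning every transaction for every itemset, as done here, is the cost that dedicated structures aim to avoid.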