Abstract: High utility itemsets are sets of items that have a high utility (e.g., a high profit or a high importance) in a transaction database. Discovering high utility itemsets has many important real-life applications, such as market basket analysis. Nonetheless, mining these patterns is time-consuming because of the huge search space and the high cost of utility computation. Most previous work focuses on search space pruning but pays little attention to utility computation. In fact, both search space pruning and high utility itemset identification rely on computing various utilities. This paper proposes a novel algorithm named REX (Rapid itEmset eXtraction), which extends the classic d<math>2</math>HUP algorithm with an improved structure, a <math>k</math>-item utility machine, and an efficient switch strategy. The improved structure significantly reduces the time complexity of utility computation compared with the original structure used in d<math>2</math>HUP. The utility machine quickly merges identical transactions and applies an efficient procedure for computing the utilities of the extensions of a given itemset. The switch strategy, derived by trial and error, yields drastic performance improvements on some databases and remains competitive with the switch strategy used in d<math>2</math>HUP on the others. Experimental results show that REX achieves speedups ranging from fifty percent to three orders of magnitude over d<math>2</math>HUP, even though the two algorithms use identical pruning techniques, and that REX considerably outperforms state-of-the-art algorithms on real-life and synthetic databases.
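To make the notions of utility and transaction merging concrete, the following is a minimal Python sketch of the standard high utility itemset definition, not of REX or d<math>2</math>HUP themselves. It assumes that the utility of an item in a transaction is its purchased quantity times its unit profit, and that two transactions count as "identical" when they contain the same items, so they can be merged by summing quantities. The toy data and the names `PROFIT`, `DATABASE`, `utility`, `merge_identical`, and `high_utility_itemsets` are all hypothetical, and the enumeration is deliberately brute force.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item, and transactions
# mapping each purchased item to its quantity.
PROFIT = {"a": 5, "b": 2, "c": 1}
DATABASE = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"a": 1, "b": 2},  # same items as the first transaction
    {"b": 1, "c": 4},
]


def utility(itemset, database, profit):
    """Sum quantity * profit over every transaction containing the whole itemset."""
    items = set(itemset)
    total = 0
    for transaction in database:
        if items.issubset(transaction):
            total += sum(transaction[i] * profit[i] for i in items)
    return total


def merge_identical(database):
    """Merge transactions with the same item set by summing quantities (assumed notion of 'identical')."""
    merged = {}
    for transaction in database:
        bucket = merged.setdefault(frozenset(transaction), {})
        for item, quantity in transaction.items():
            bucket[item] = bucket.get(item, 0) + quantity
    return list(merged.values())


def high_utility_itemsets(database, profit, min_util):
    """Brute-force enumeration: test every non-empty itemset against min_util."""
    items = sorted({item for transaction in database for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            value = utility(candidate, database, profit)
            if value >= min_util:
                result[candidate] = value
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(DATABASE, PROFIT, min_util=15))
    print(high_utility_itemsets(merge_identical(DATABASE), PROFIT, min_util=15))
```

On this toy database, the itemset {b} has utility 10 while its superset {a, b} has utility 18, so with a threshold of 15 the superset qualifies although the subset does not. Utility is therefore not anti-monotone, which is one reason the search space is hard to prune. Merging leaves every itemset utility unchanged while shrinking the number of transactions that must be scanned.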

Fast mining erasable itemsets using NC_sets

An Efficient Algorithm For Mining Erasable Itemsets

Mining top-rank-K erasable itemsets

Mining Noise-Tolerant Frequent Closed Itemsets in Very Large Database

Gc-Tree: A Fast Online Algorithm For Mining Frequent Closed Itemsets

A Bitmap Approach for Mining Erasable Itemsets

Generic Itemset Mining Based on Reinforcement Learning

Mining Top-Rank-k Erasable Itemsets by PID_lists

Dtgc-Tree: A New Strategy Of Association Rules Mining

An Efficient Structure for Fast Mining High Utility Itemsets

An efficient approach for interactive mining of frequent itemsets

Fast mining frequent itemsets using Nodesets

An efficient mining scheme for high utility itemsets

An Efficient Data Structure for Fast Mining High Utility Itemsets

A New Algorithm for Fast Mining Frequent Itemsets Using N-lists

Mining high utility itemsets using extended chain structure and utility machine

MRI-CE: Minimal rare itemset discovery using the cross-entropy method

FHUQI-Miner: Fast high utility quantitative itemset mining

Mining High Occupancy Itemsets

Multiobjective-integer-programming-based Sensitive Frequent Itemsets Hiding

PrePost+