Abstract: Frequent itemset mining has been studied extensively in the literature. Most previous studies require the specification of a min_support threshold and aim at mining a complete set of frequent itemsets satisfying min_support. However, in practice, it is difficult for users to provide an appropriate min_support threshold. In addition, a complete set of frequent itemsets is much less compact than a set of frequent closed itemsets. In this paper, we propose an alternative mining task: mining top-k frequent closed itemsets of length no less than min_l, where k is the desired number of frequent closed itemsets to be mined, and min_l is the minimal length of each itemset. An efficient algorithm, called TFP, is developed for mining such itemsets without min_support. Starting at min_support = 0 and by making use of the length constraint and the properties of top-k frequent closed itemsets, min_support can be raised effectively and the FP-Tree can be pruned dynamically both during and after the construction of the tree using our two proposed methods: the closed node count and descendant_sum. Moreover, mining is further sped up by employing a combined top-down and bottom-up FP-Tree traversal strategy, a set of search space pruning methods, a fast 2-level hash-indexed result tree, and a novel closed itemset verification scheme. Our extensive performance study shows that TFP has high performance and linear scalability in terms of the database size.
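
To make the mining task itself concrete, the following is a minimal brute-force sketch in Python. It is not the TFP algorithm: it uses no FP-Tree, no support raising, and no pruning, and it is only practical for tiny inputs. The function name `top_k_closed_itemsets`, the toy transactions, and the tie-breaking rule (ties in support are broken alphabetically and the list is cut at exactly k) are assumptions made for illustration. An itemset is treated as closed when it equals the intersection of all transactions that contain it, which is equivalent to having no proper superset with the same support.

```python
from itertools import combinations


def top_k_closed_itemsets(transactions, k, min_l):
    """Brute-force reference for the task: return up to k closed itemsets
    of length >= min_l, ordered by descending support.

    Illustrative only (exponential in the number of distinct items);
    this is NOT the TFP algorithm described in the abstract.
    """
    tx = [frozenset(t) for t in transactions]
    items = sorted(set().union(*tx))
    results = []
    for size in range(max(min_l, 1), len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Cover: every transaction that contains the candidate.
            cover = [t for t in tx if candidate <= t]
            if not cover:
                continue  # support 0, not frequent at any threshold
            # Closed iff the candidate equals the intersection of its cover.
            if frozenset.intersection(*cover) == candidate:
                results.append((candidate, len(cover)))
    # Assumption: break support ties alphabetically, then cut at k.
    results.sort(key=lambda pair: (-pair[1], sorted(pair[0])))
    return results[:k]


# Hypothetical toy database of four transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c", "d"},
]

for itemset, support in top_k_closed_itemsets(transactions, k=2, min_l=2):
    print(sorted(itemset), support)
```

On this toy input, {b, c} is not closed because every transaction containing it also contains a, so {a, b, c} has the same support. No min_support value is supplied by the caller; only k and min_l are, which is the point of the proposed task.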

Parallel mining of top-k frequent itemsets in very large text database

An Efficient Method for the Parallel Mining of Frequent Itemsets in Very Large Text Databases

Mining Noise-Tolerant Frequent Closed Itemsets in Very Large Database.

Efficient Top-k Frequent Itemset Mining on Massive Data

TFP: an Efficient Algorithm for Mining Top-K Frequent Closed Itemsets

Gc-Tree: A Fast Online Algorithm For Mining Frequent Closed Itemsets

A Stable Parallel Distributed Frequent Itemset Mining Algorithm and Its Application

Efficient Parallel Frequent Itemsets Mining Algorithm

HPFP-Miner: A Novel Parallel Frequent Itemset Mining Algorithm

BitTableFI: An efficient mining frequent itemsets algorithm

A Novel Parallel Algorithm for Frequent Itemsets Mining in Massive Small Files Datasets

PNPFI: An Efficient Parallel Frequent Itemsets Mining Algorithm

Efficiently Mining Frequent Itemsets on Massive Data

Efficient Top-K High Utility Itemset Mining on Massive Data

Mining Maximum Length Frequent Itemsets: A Summary of Results

Fast Mining of Global Maximum Frequent Itemsets

A New Algorithm for Frequent Itemsets Mining Based on Apriori and FP-Tree

Frequent itemset mining with parallel RDBMS

AT-Mine: an Efficient Algorithm of Frequent Itemset Mining on Uncertain Dataset.

Efficient Probabilistic Frequent Itemset Mining In Big Sparse Uncertain Data

Efficient High-Utility Occupancy Itemset Mining Algorithm on Massive Data