Non-Almost-Derivable Frequent Itemsets Mining

Yang Xiaoming, Wang Zhibin, Liu Bing, Zhang Shouzhi
DOI: https://doi.org/10.1109/CIT.2005.144
2005-01-01
Abstract: The number of frequent itemsets is often too large to handle, so a condensed representation of the collection of all frequent itemsets is needed. In this paper, we propose a new condensed representation called frequent non-almost-derivable itemsets. This representation is a subset of the original collection of frequent itemsets. For any removed itemset X (called a frequent almost-derivable itemset), we can derive a lower and an upper bound on its support from this representation, and the gap between the two bounds is small (it is controlled by a user-defined parameter). We also propose an Apriori-like algorithm that can extract all frequent non-derivable itemsets. Extensive empirical results on real datasets show that the representation is compact and approximates supports well.
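To illustrate how a support can be bounded from the supports of subsets, the following minimal sketch applies the standard inclusion-exclusion deduction rules known from non-derivable itemset mining. The abstract does not state the paper's exact rules or its closeness criterion, so the test `upper - lower <= delta`, the parameter name `delta`, and both function names are assumptions made for illustration only.

```python
from itertools import combinations


def support_bounds(itemset, support):
    """Return (lower, upper) bounds on the support of `itemset`.

    `support` maps frozenset -> support count for every proper subset of
    `itemset`, including the empty set (the total number of transactions).
    """
    items = frozenset(itemset)
    lower, upper = 0, float("inf")
    for k in range(len(items) + 1):
        for x in combinations(sorted(items), k):
            x = frozenset(x)
            rest = items - x
            # Inclusion-exclusion sum over all J with X <= J < I.
            total = 0
            for m in range(len(rest)):  # m < len(rest) excludes J == I
                for extra in combinations(sorted(rest), m):
                    j = x | frozenset(extra)
                    sign = (-1) ** (len(items - j) + 1)
                    total += sign * support[j]
            # Odd |I \ X| gives an upper bound, even gives a lower bound.
            if len(rest) % 2 == 1:
                upper = min(upper, total)
            else:
                lower = max(lower, total)
    return lower, upper


def is_almost_derivable(itemset, support, delta):
    """Assumed criterion: the bounds differ by at most `delta`."""
    lower, upper = support_bounds(itemset, support)
    return upper - lower <= delta


# Toy data (hypothetical): 10 transactions, supp(a) = 8, supp(b) = 7.
supports = {
    frozenset(): 10,
    frozenset("a"): 8,
    frozenset("b"): 7,
}
bounds_ab = support_bounds("ab", supports)
```

With these toy counts, the support of {a, b} is bounded below by 8 + 7 - 10 = 5 and above by min(8, 7) = 7, so under the assumed criterion the itemset would be treated as almost-derivable for `delta = 2` but not for `delta = 1`.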