Mining Frequent Itemsets Based on Equivalent Classes in Large Databases

Jianxu MAO, Jianpin Mao, Xiaoling Yao, Caiping LIU
2011-01-01
Abstract: Finding frequent itemsets is one of the most basic problems in data mining. The sheer volume of data makes traditional frequent-pattern mining algorithms difficult to extend to large databases. Taking the characteristics of large databases into account, and inspired by the effectiveness of the FP-growth algorithm, a new algorithm, EFP-growth, is proposed for mining frequent patterns in large databases. Based on the properties of equivalence classes, which separate the itemsets of association rules into many subsets, the proposed algorithm divides a large database into many projected subsets and carries out constrained frequent pattern mining on each subset. Experiments show that the algorithm speeds up mining and that its space scalability is superior to that of the FP-growth algorithm. Moreover, the algorithm shows very good time and space scalability as the database grows.
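
The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea it describes: group itemsets into classes by their first item, build one projected sub-database per class, and mine each projection independently. It is not the paper's EFP-growth. In particular, it counts items directly in plain lists rather than building FP-trees, and the function names (`frequent_items`, `project`, `mine`) and the alphabetical item order are assumptions made for illustration.

```python
from collections import Counter


def frequent_items(transactions, min_support):
    """Count each item once per transaction and keep those meeting min_support."""
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c for item, c in counts.items() if c >= min_support}


def project(transactions, item, order):
    """Projected database for `item`: the transactions that contain it,
    restricted to items ranked after it in `order`."""
    rank = order[item]
    return [[x for x in t if order[x] > rank]
            for t in transactions if item in t]


def mine(transactions, min_support, prefix=()):
    """Return {itemset (tuple): support} for all frequent itemsets.

    Each frequent item starts one class of itemsets sharing that prefix;
    the class is mined only from its own projected database.
    """
    freq = frequent_items(transactions, min_support)
    # Assumed ordering: alphabetical, chosen only to make results deterministic.
    order = {item: r for r, item in enumerate(sorted(freq))}
    # Drop infrequent items so projections stay small.
    pruned = [[x for x in t if x in freq] for t in transactions]

    result = {}
    for item in sorted(freq):
        itemset = prefix + (item,)
        result[itemset] = freq[item]
        sub_db = project(pruned, item, order)
        result.update(mine(sub_db, min_support, itemset))
    return result
```

Calling `mine(transactions, min_support)` on a list of item lists returns a dictionary keyed by itemset tuples. Because each projection only holds the transactions relevant to one prefix, the projections could in principle be processed one at a time, which is the property the abstract relies on for large databases.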