Distributed Frequent Closed Itemsets Mining.

Chun Liu, Zheng, Kai-Yuan Cai, Shichao Zhang
DOI: https://doi.org/10.1109/sitis.2007.64
2007-01-01
Abstract: As many large organizations have multiple data sources and datasets keep growing in scale, data mining must increasingly be carried out in distributed environments. In this paper, we address the problem of mining global frequent closed itemsets in a distributed environment. We propose a novel algorithm that obtains the global frequent closed itemsets with their exact frequencies, and we show that it finds all of them. A new data structure is developed to maintain the closed itemsets, and an efficient implementation is built on that structure. Experimental results show that the proposed algorithm is effective.
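
To make the target of the problem concrete, the following is a minimal brute-force sketch of what "global frequent closed itemsets with exact frequency" means. It is not the algorithm or data structure proposed in the paper, which the abstract does not describe. It assumes the standard definitions: the global support of an itemset is the sum of its local supports over all sites, an itemset is frequent if that sum reaches a minimum count, and it is closed if no proper superset has the same support. The names `support`, `global_frequent_closed`, `sites`, and `min_count` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def global_frequent_closed(sites, min_count):
    """Brute-force reference (illustrative only, exponential in the number of items).

    `sites` is a list of local databases; each database is a list of sets.
    Returns a dict mapping each global frequent closed itemset to its exact
    global support.
    """
    items = sorted(set().union(*(t for site in sites for t in site)))

    # Exact global support = sum of the local supports at every site.
    frequent = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            total = sum(support(candidate, site) for site in sites)
            if total >= min_count:
                frequent[candidate] = total

    # Closed: no proper superset has the same support. A superset with equal
    # support is itself frequent, so checking within `frequent` is enough.
    return {
        x: s
        for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }


# Two hypothetical sites holding five transactions in total.
site_a = [{"a", "b", "c"}, {"a", "b"}]
site_b = [{"a", "b", "c"}, {"b", "c"}, {"a", "b"}]
result = global_frequent_closed([site_a, site_b], min_count=3)
# {b}: 5, {a, b}: 4, {b, c}: 3
```

In this toy data, `{a}` is frequent but not closed because `{a, b}` has the same global support of 4. A distributed algorithm has to reach the same answer without shipping every transaction to one place, which is the difficulty the paper addresses.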