Research on Parallel Mining Frequent Patterns with Taboo Constraints

薛胜军,赵洪昌
DOI: https://doi.org/10.3963/j.issn.1671-4431.2013.03.026
2013-01-01
Wuhan Ligong Daxue Xuebao/Journal of Wuhan University of Technology
Abstract: Data mining based on frequent patterns is widely used to discover association rules and has gradually become one of the hot research fields in data mining. Researchers have found that traditional frequent pattern mining algorithms produce a large amount of intermediate data, along with results that users are not interested in. As massive data mining develops rapidly, this data poses a clear challenge in computation and storage overhead and seriously reduces mining efficiency and accuracy. To address this problem, the paper analyzes traditional frequent pattern mining algorithms in combination with the currently popular Hadoop technology and proposes a cloud data mining algorithm based on frequent patterns with taboo constraints. The algorithm uses the Hadoop framework to constrain the length and attributes of patterns during frequent pattern mining and completes the mining tasks in a distributed, parallel manner. Experimental results show that the algorithm has advantages over traditional algorithms for mining massive data.
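The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a "taboo" constraint means a set of attributes that must not appear in any pattern, that the length constraint is a maximum pattern size, and it imitates the map and reduce phases with plain Python functions instead of Hadoop. The names `map_partition`, `reduce_counts`, `taboo_items`, `max_len`, and `min_support` are hypothetical, and candidates are enumerated by brute force purely for illustration.

```python
from collections import Counter
from itertools import combinations


def map_partition(transactions, taboo_items, max_len):
    """Map step (simulated): count candidate itemsets within one data split."""
    counts = Counter()
    for transaction in transactions:
        # Assumed taboo constraint: drop forbidden items before
        # any candidate pattern is generated.
        allowed = sorted(set(transaction) - taboo_items)
        # Assumed length constraint: enumerate itemsets of at most max_len items.
        for size in range(1, min(max_len, len(allowed)) + 1):
            for itemset in combinations(allowed, size):
                counts[itemset] += 1
    return counts


def reduce_counts(partial_counts, min_support):
    """Reduce step (simulated): sum per-split counts and keep frequent itemsets."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Two splits stand in for data blocks processed by separate mappers.
splits = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c", "d"], ["b", "c"]],
]
frequent = reduce_counts(
    (map_partition(s, taboo_items={"d"}, max_len=2) for s in splits),
    min_support=2,
)
```

Applying the constraints inside the map step, rather than filtering the final result, is what keeps the intermediate data small in this sketch: itemsets that contain a taboo item or exceed the length bound are never emitted.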