Research on Association Rules Mining Algorithm Based on Hadoop: Taking Apriori as an Example

Mu-lin LIU, Qing-hua ZHU
DOI: https://doi.org/10.3969/j.issn.1673-629X.2016.07.001
2016-01-01
Abstract: Traditional association rule mining algorithms can no longer meet the efficiency and scalability demands of mining large volumes of data. To address this problem, taking Apriori as an example, the algorithm is parallelized on the Hadoop framework using the MapReduce model. On this basis, it is further improved with the transaction reduction method to raise mining efficiency. Experiments on a Hadoop cluster evaluate both the mining results and the efficiency. They cover verification of the parallel mining results, a comparison of efficiency between the serial and parallel versions, and the relationships between mining time and number of nodes and between mining time and data volume. The experiments show that the parallel Apriori implementation mines frequent itemsets accurately, with better performance and scalability. It better meets the requirements of big data mining and can efficiently mine frequent itemsets and association rules from large datasets.
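The approach summarized in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' Hadoop implementation: the map and reduce phases are simulated in plain Python, and all names (`map_count`, `reduce_count`, `gen_candidates`, `apriori`, `n_splits`) are hypothetical. Transaction reduction is assumed here to mean dropping any transaction that contains no frequent k-itemset, since such a transaction cannot support a frequent (k+1)-itemset.

```python
from collections import Counter
from itertools import combinations


def map_count(transactions, candidates):
    """Map phase: emit (candidate, 1) for each candidate contained in a transaction."""
    for t in transactions:
        for c in candidates:
            if c <= t:
                yield c, 1


def reduce_count(pairs, min_count):
    """Reduce phase: sum the counts per candidate and keep the frequent ones."""
    totals = Counter()
    for itemset, n in pairs:
        totals[itemset] += n
    return {s: n for s, n in totals.items() if n >= min_count}


def gen_candidates(frequent, k):
    """Build k-itemsets whose every (k-1)-subset is frequent (Apriori property)."""
    items = sorted({i for s in frequent for i in s})
    return [
        frozenset(c)
        for c in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
    ]


def apriori(transactions, min_count, n_splits=2):
    """Level-wise Apriori with simulated input splits and transaction reduction."""
    transactions = [frozenset(t) for t in transactions]
    candidates = [frozenset([i]) for i in sorted({i for t in transactions for i in t})]
    result, k = {}, 1
    while candidates:
        # Simulated input splits: each split is mapped independently,
        # as separate map tasks would be on a cluster.
        splits = [transactions[i::n_splits] for i in range(n_splits)]
        pairs = (p for split in splits for p in map_count(split, candidates))
        frequent = reduce_count(pairs, min_count)
        if not frequent:
            break
        result.update(frequent)
        # Transaction reduction (assumed form): drop transactions that
        # contain no frequent k-itemset before the next pass.
        transactions = [t for t in transactions if any(s <= t for s in frequent)]
        k += 1
        candidates = gen_candidates(frequent, k)
    return result
```

Calling `apriori([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}], min_count=2)` returns a dict mapping each frequent itemset (as a `frozenset`) to its support count. On a real cluster each pass would be a separate MapReduce job, with the candidate set distributed to every mapper.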