Abstract:The Apriori algorithm is one of the most well-known and widely accepted methods for the association rule mining. In Apriori, it uses a prefix tree to represent k-itemsets, generates k-itemset candidates based on the frequent (k-1)-itemsets, and determines the frequent k-itemsets by traversing the prefix tree iteratively based on the transaction records. When k is small, the execution of Apriori is very efficient. However, the execution of Apriori could be very slow when k becomes large because of the deeper recursion depth to determine the frequent k-itemsets. From the perspective of graph computing, the transaction records can be converted to a graph G(V,E), where V is the set of vertices of G that represents the transaction records and E is the set of edges of G that represents the relations among transaction records. Each k-itemset in the transaction records will have a corresponding connected component in G. The number of vertices in the corresponding connected component is the support of the k-itemset. Since the time to find the corresponding connected component of a k-itemset in G is constant for any k, the graph computing method will be very efficient if the number of k-itemsets is relatively small. Based on Apriori and graph computing techniques, a hybrid method, called Apriori and Graph Computing (ANG), is proposed to compute the frequent itemsets. Initially, ANG uses Apriori to compute the frequent k-itemsets and then switches to the graph computing method when k becomes large (where the number of k-itemset candidates is relatively small). The experimental results show that ANG outperforms both Apriori and the graph computing method for all test cases.

Research on Association Rules Mining Algorithm Based on Hadoop-Taking Apriori as an Example

An improved parallel association rules algorithm based on MapReduce framework for big data

Distributed Affinity Propagation Clustering Based on MapReduce

A Set Operation Based Algorithm for Association Rules Mining

A Dynamic Approach Based on Apriorilike Algorithm for Mining Association Rules

Research on algorithm of association rules in Distributed Database System

Dtgc-Tree: A New Strategy Of Association Rules Mining

Parallel mining and application of fuzzy association rules

Performance Analysis of Apriori Algorithm with Different Data Structures on Hadoop Cluster

Observations on Factors Affecting Performance of MapReduce based Apriori on Hadoop Cluster

Density-Based Mining of Quantitative Association Rules

Optimization Design of Apriori Algorithm Based on Big Data Analysis and Cloud Computing

Parallel Algorithm for Mining Fuzzy Association Rules

An Improved Apriori Algorithm in Association Rules

Association Rules Mining Of Silk Relics Database With The Rcfp-Growth Algorithm

ANG: a Combination of Apriori and Graph Computing Techniques for Frequent Itemsets Mining

Algorithms Research for Re-mining of Association Rules Based on Support Changes

Node Association Rule Mining in Wireless Sensor Networks

Improvement of Apriori Algorithm Using Parallelization Technique on Multi-CPU and GPU Topology

Visual Analysis of User-Driven Association Rule Mining

An Association Mining Algorithm Based on Matrix