Abstract: In recent years, knowledge discovery in databases has provided a powerful means of discovering meaningful and useful information. Frequent pattern (FP) mining and association rule mining have been studied extensively for numerous real-life applications. Traditional mining algorithms assume that the data are centralized and memory-resident. When these methods are applied to distributed databases, especially in the era of big data, the large data volume, limited bandwidth, and energy constraints keep them from performing effectively. Hence, data mining in distributed environments has emerged as an important research area. To improve performance, we propose a set of FP-growth-based algorithms that discover FPs and provide fast, scalable service in distributed computing environments, together with a compact data structure that stores items and counts to minimize the data transmitted over the network. To ensure completeness and execution capability, DistEclat and BigFIM were chosen for the experimental comparison. Experiments show that the proposed method is more cost-effective when processing massive datasets and performs well under various experimental conditions. On average, the proposed method required only 33% of the execution time and 45% of the transmission cost of DistEclat, and 23.3% of the execution time and 14.2% of the transmission cost of BigFIM.

DOM-Based Algorithm of Mining Frequent Patterns from XML Data

Research of frequent pattern mining from XML data based on heterogeneous XML schema

Bottom-up Discovery of Frequent Rooted Unordered Subtrees

Mining frequent association tag sequences for clustering XML documents

XML Frequent Pattern Tree Mining Based on Recursive Right Path Extending

Efficient Mining of Frequent Closed XML Query Pattern.

XML Documents Cluster Research Based on Frequent Subpatterns

Mop: an Efficient Algorithm for Mining Frequent Pattern with Subtree Traversing

Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining

Efficient mining of frequent closed XML query pattern

Mining Frequent Rooted Ordered Tree Generators Efficiently

Discovery of Frequent Query Patterns in XML Pattern Graph with DTD Cardinality Constraints

Extract Frequent Pattern from Simple Graph Data.

Efficient Frequent Pattern Mining in Relational Databases.

Mining Frequent Patterns with the Pattern Tree.

Discovering Frequent Subtrees from XML Data Using Neural Networks

An intelligence strategy of refactoring XML structure based on XFP-tree

A Distributed Method for Fast Mining Frequent Patterns From Big Data

Tree model guided candidate generation for mining frequent subtrees from XML documents

A Mining Algorithm for Frequent Patterns Based on Prefix Tree

APT-Structure: Efficient Mining of Frequent Patterns