Abstract: Frequent pattern mining is an important data mining problem with broad applications. Most studies in this field use support (frequency) to measure the popularity of a pattern, namely the fraction of transactions or sequences in a data set that include the pattern. In this study, we introduce a new interestingness measure, occupancy, to measure the completeness of a pattern in its supporting transactions or sequences. This is motivated by real-world pattern recommendation applications in which an interesting pattern should not only be frequent but also occupy a large portion of its supporting transactions or sequences. Given the definition of occupancy, we call a pattern dominant if its occupancy value is above a user-specified threshold. Our task is then to identify the qualified patterns, which are both dominant and frequent. We also formulate the problem of mining top-k qualified patterns, that is, finding the k qualified patterns with maximum values of a user-defined function of support and occupancy, for example, a weighted sum of support and occupancy. The challenge in these tasks is that the value of occupancy does not change monotonically when more items are appended to a given pattern. Therefore, we propose a general algorithm called DOFRA (DOminant and FRequent pattern mining Algorithm) for mining these qualified patterns, which exploits upper-bound properties of occupancy to drastically reduce the search space. Finally, we show the effectiveness of DOFRA in two real-world applications and demonstrate its efficiency on several real and large synthetic datasets.
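
To make the two measures concrete, the following is a minimal Python sketch, not DOFRA itself. The abstract does not give a formula for occupancy, so the sketch assumes it is the average of |pattern| / |transaction| over the supporting transactions. The function names, the brute-force enumeration, and the scoring function `support + weight * occupancy` are hypothetical choices made for illustration only.

```python
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of `pattern`."""
    pattern = frozenset(pattern)
    return sum(pattern <= t for t in transactions) / len(transactions)


def occupancy(pattern, transactions):
    """Assumed definition: mean of |pattern| / |t| over supporting transactions t."""
    pattern = frozenset(pattern)
    supporting = [t for t in transactions if pattern <= t]
    if not supporting:
        return 0.0
    return sum(len(pattern) / len(t) for t in supporting) / len(supporting)


def top_k_qualified(transactions, min_sup, min_occ, k, weight=0.5):
    """Brute-force baseline (exponential in the number of items, no pruning).

    Keeps patterns that are frequent (support >= min_sup) and dominant
    (occupancy >= min_occ), then ranks them by the hypothetical score
    support + weight * occupancy.
    """
    items = sorted(set().union(*transactions))
    scored = []
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            sup = support(pattern, transactions)
            occ = occupancy(pattern, transactions)
            if sup >= min_sup and occ >= min_occ:
                scored.append((sup + weight * occ, pattern))
    # Highest score first; ties broken by the pattern itself for determinism.
    scored.sort(key=lambda entry: (-entry[0], entry[1]))
    return scored[:k]


# Toy data: one short transaction and one long transaction.
transactions = [frozenset("a"), frozenset("abcde")]

# Occupancy first drops, then rises, as items are appended to {a}.
occ_a = occupancy("a", transactions)
occ_ab = occupancy("ab", transactions)
occ_abcde = occupancy("abcde", transactions)

top2 = top_k_qualified(transactions, min_sup=0.5, min_occ=0.5, k=2)
```

On this toy data, extending {a} to {a, b} lowers the assumed occupancy, while extending further to {a, b, c, d, e} raises it again. This is the non-monotonic behaviour that rules out simple Apriori-style pruning on occupancy and motivates bounding it from above instead.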

Mop: an Efficient Algorithm for Mining Frequent Pattern with Subtree Traversing

Efficient Pattern-Growth Methods for Frequent Tree Pattern Mining

Chopper: Efficient Algorithm for Tree Mining

Mining Frequent Closed Patterns by Adaptive Pruning

JPMiner: Mining Frequent Jump Patterns from Graph Databases

A Pattern Growth Algorithm for Frequent Patterns Mining

Extract Frequent Pattern from Simple Graph Data

Mining Frequent Ordered Patterns without Candidate Generation

A New Fast Vertical Method for Mining Frequent Patterns

ESPM - An algorithm to mine frequent subtrees

Co-occurrence order-preserving pattern mining

Efficient Incremental Maintenance of Frequent Patterns with FP-tree

Mining Top-Rank-K Frequent Patterns

Occupancy-Based Frequent Pattern Mining

Mining Frequent Induced Subtrees by Prefix-Tree-Projected Pattern Growth

A Distributed Method for Fast Mining Frequent Patterns From Big Data

A Pattern Growth Method Based on Memory Indexing for Frequent Patterns Mining

Bottom-up Discovery of Frequent Rooted Unordered Subtrees

OPR-Miner: Order-preserving rule mining for time series

A Two-Phase Approach for Unexpected Pattern Mining

A cost-effective approach for mining near-optimal top-k patterns