Abstract: Frequent pattern mining is an important data mining problem with broad applications. Most studies in this field use support (frequency) to measure the popularity of a pattern, namely the fraction of transactions or sequences in a data set that include the pattern. In this study, we introduce a new interestingness measure, occupancy, which measures the completeness of a pattern in its supporting transactions or sequences. This is motivated by real-world pattern recommendation applications in which an interesting pattern should not only be frequent but also occupy a large portion of its supporting transactions or sequences. Given the definition of occupancy, we call a pattern dominant if its occupancy value is above a user-specified threshold. Our task is then to identify the qualified patterns, those that are both dominant and frequent. We also formulate the problem of mining top-k qualified patterns, that is, finding the k qualified patterns with the maximum values of a user-defined function of support and occupancy, for example, a weighted sum of the two. The challenge in these tasks is that occupancy does not change monotonically as more items are appended to a given pattern. We therefore propose a general algorithm called DOFRA (DOminant and FRequent pattern mining Algorithm) for mining these qualified patterns, which exploits upper-bound properties of occupancy to drastically reduce the search space. Finally, we show the effectiveness of DOFRA in two real-world applications and demonstrate its efficiency on several real and large synthetic datasets.
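To make the two measures concrete, the following is a minimal sketch rather than the paper's implementation. The abstract does not give the exact formula for occupancy, so the sketch assumes it is the average, over supporting transactions, of the fraction of each transaction covered by the pattern. The function names, the thresholds, and the weight `alpha` are all hypothetical.

```python
from typing import FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def support_and_occupancy(pattern: Itemset,
                          transactions: List[Itemset]) -> Tuple[float, float]:
    """Return (support, occupancy) of a non-empty pattern.

    Assumption: occupancy is the mean of |pattern| / |t| over the
    transactions t that contain the pattern. The abstract does not
    state the exact definition.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    supporting = [t for t in transactions if pattern <= t]
    if not supporting:
        return 0.0, 0.0
    support = len(supporting) / len(transactions)
    occupancy = sum(len(pattern) / len(t) for t in supporting) / len(supporting)
    return support, occupancy


def is_qualified(pattern: Itemset, transactions: List[Itemset],
                 min_support: float, min_occupancy: float) -> bool:
    """A pattern is qualified if it is both frequent and dominant."""
    support, occupancy = support_and_occupancy(pattern, transactions)
    return support >= min_support and occupancy >= min_occupancy


def weighted_score(pattern: Itemset, transactions: List[Itemset],
                   alpha: float = 0.5) -> float:
    """Hypothetical top-k ranking function: weighted sum of the two measures."""
    support, occupancy = support_and_occupancy(pattern, transactions)
    return alpha * support + (1.0 - alpha) * occupancy


if __name__ == "__main__":
    db = [frozenset("ab"), frozenset("abcde"), frozenset("ac")]
    for items in ("a", "ac", "acd"):
        print(items, support_and_occupancy(frozenset(items), db))
```

On this toy database the occupancy rises when `c` is appended to `a` and falls again when `d` is appended, while support only ever decreases. This is the non-monotonic behaviour that rules out simple Apriori-style pruning on occupancy alone and motivates the upper-bound pruning the abstract describes.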

Mining Top-Rank-K Frequent Patterns

Mining Associated and Item-Item Correlated Frequent Patterns

Mining Top-k Minimal Redundancy Frequent Patterns over Uncertain Databases.

Near-optimal Top-k Pattern Mining

Mining Noise-Tolerant Frequent Closed Itemsets in Very Large Database.

Fast Mining Top-Rank-k Frequent Patterns by Using Node-lists.

A cost-effective approach for mining near-optimal top-k patterns

A Two-Phase Approach for Unexpected Pattern Mining.

VTK: Vertical Mining of Top-Rank-K Frequent Patterns

Mining Top-K Frequent Closed Patterns Without Minimum Support

Mining Frequent Ordered Patterns

Occupancy-Based Frequent Pattern Mining*

Top-Down Mining of Frequent Closed Patterns from Very High Dimensional Data

Towards Efficient Re-mining of Frequent Patterns Upon Threshold Changes

A Distributed Method for Fast Mining Frequent Patterns From Big Data

Efficiently Mining Maximal Frequent Mutually Associated Patterns.

JPMiner: Mining Frequent Jump Patterns from Graph Databases.

An Algorithm of Mining Frequent Itemsets Based on Bloom Filter

Mining Top-K Co-Occurrence Items

Extract Frequent Pattern from Simple Graph Data.

Mining Frequent Closed Patterns by Adaptive Pruning