Abstract: Data uncertainty is inherent in many applications, and its importance is growing rapidly with the development of modern technologies. Researchers have accordingly paid more attention to mining patterns in uncertain databases, and a few recent works attempt to mine frequent uncertain sequential patterns. Despite their success, these works neither reduce the number of false-positive patterns generated during mining nor maintain the patterns efficiently. In this paper, we propose multiple theoretically tightened pruning upper bounds that substantially reduce the mining space. A novel hierarchical structure is introduced to maintain the patterns in a space-efficient way. We then develop a versatile framework for mining uncertain sequential patterns that can also handle weight constraints effectively. Moreover, existing works do not scale to incremental uncertain databases. Several incremental sequential pattern mining algorithms exist, but they are limited to precise databases. Therefore, we propose a new technique that adapts our framework to mine patterns when the database is incremental. Finally, we conduct extensive experiments on several real-life datasets and show the efficacy of our framework in different applications.

What problem does this paper attempt to address?

This paper addresses several key problems in mining sequential patterns from uncertain databases:

1. **Reducing the Generation of False-positive Patterns**: Existing methods generate a large number of false-positive patterns (patterns that are treated as potentially frequent but do not actually meet the frequency threshold) during mining, which causes unnecessary computation and wastes resources.
2. **Efficiently Maintaining Candidate Patterns**: Existing methods maintain candidate patterns inefficiently, which makes support calculation expensive and degrades overall performance.
3. **Lack of an Effective Weight Upper Bound**: For weighted pattern mining, existing methods lack an upper-bound measure that handles weights effectively while preserving the anti-monotonic property.
4. **Scalability Issues of Incremental Databases**: Most modern databases are dynamic and grow incrementally. Existing uncertain sequential pattern mining algorithms cannot handle this effectively, and rerunning a batch algorithm after each increment is impractical.

To solve these problems, the authors propose a new framework with the following contributions:

- **Theoretically Tightened Pruning Upper Bounds**: Three theoretically tighter upper bounds (`expSupcap`, `wgtcap`, `wExpSupcap`) are proposed to significantly reduce the mining space and thereby the number of false positives.
- **Hierarchical Index Structure**: A novel hierarchical index structure, `USeq-Trie`, is introduced to maintain patterns more efficiently.
- **Fast Support Calculation Method**: A faster method, `SupCalc`, is developed to calculate the expected support of patterns.
- **Efficient Uncertain Sequential Pattern Mining Algorithm**: An efficient algorithm named `FUSP` is proposed for mining sequential patterns in uncertain databases.
- **Incremental Mining Method**: For incremental databases, a new technique, `InUSP`, is proposed. It mines patterns as the database grows and improves both mining efficiency and the completeness of the results by introducing Promising Frequent Sequences (PFS).

Through these improvements, the paper aims to provide a more efficient, accurate, and widely applicable method for mining uncertain sequential patterns, including in incremental databases.
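
To make the notion of expected support concrete, here is a minimal Python sketch. It is not the paper's `SupCalc` or `FUSP`. It assumes a common definition from the uncertain sequential pattern mining literature: the support of a pattern in one uncertain sequence is the highest probability of any single occurrence (the product of the matched items' existential probabilities, treated as independent), and the expected support is the sum of these values over all sequences. The function names and the data layout (a sequence as a list of `item -> probability` dictionaries) are hypothetical choices made for illustration.

```python
from math import prod


def max_match_probability(sequence, pattern):
    """Highest probability of any single occurrence of `pattern` in `sequence`.

    sequence: list of events, each a dict mapping item -> existential probability
              (assumed layout, not taken from the paper).
    pattern:  list of item sets, one per pattern element, matched in order.
    """
    # best[j] = best probability of matching the first j pattern elements so far.
    best = [1.0] + [0.0] * len(pattern)
    for event in sequence:
        # Walk right to left so one event matches at most one pattern element.
        for j in range(len(pattern), 0, -1):
            items = pattern[j - 1]
            if best[j - 1] > 0.0 and all(i in event for i in items):
                p = prod(event[i] for i in items)
                best[j] = max(best[j], best[j - 1] * p)
    return best[len(pattern)]


def expected_support(database, pattern):
    """Sum of per-sequence best-occurrence probabilities (assumed definition)."""
    return sum(max_match_probability(seq, pattern) for seq in database)


# Toy uncertain database: two sequences of probabilistic events.
db = [
    [{"a": 0.9}, {"b": 0.5}, {"a": 0.4, "b": 0.8}],
    [{"a": 0.6}, {"c": 0.7}],
]
# A later increment that appends one new sequence.
delta = [[{"a": 0.5}, {"b": 0.6}]]

pat = [{"a"}, {"b"}]  # "a" followed later by "b"
print(expected_support(db, pat))
print(expected_support(db + delta, pat))
```

On this toy data the pattern `a` then `b` has an expected support of about 0.72 in `db`: the best occurrence in the first sequence is 0.9 × 0.8, and the second sequence contains no `b`. Under the assumed definition the measure is additive over sequences, so appending new sequences only adds their contributions. This hints at why an incremental method can avoid rerunning a batch algorithm from scratch, although the paper's `InUSP` technique involves more than this.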

Mining Sequential Patterns in Uncertain Databases Using Hierarchical Index Structure

Mining Weighted Sequential Patterns in Incremental Uncertain Databases

Mining Top-k Minimal Redundancy Frequent Patterns over Uncertain Databases

Sequential Pattern Mining in Databases with Temporal Uncertainty

Accelerated Frequent Closed Sequential Pattern Mining for Uncertain Data

Towards Efficient Sequential Pattern Mining in Temporal Uncertain Databases

Frequent Pattern Mining with Uncertain Data

Mining Uncertain Sequential Patterns in Iterative MapReduce

Frequent Pattern Mining Algorithms With Uncertain Data

Mining Probabilistically Frequent Sequential Patterns in Large Uncertain Databases

A Two-Phase Approach for Unexpected Pattern Mining

Towards utility-driven contiguous sequential patterns in uncertain multi-sequences

Sequential Pattern Mining in Multi-Databases Via Multiple Alignment

Indexing And Mining Of The Local Patterns In Sequence Database

Mining Sequential Patterns with Constraints in Large Databases

Efficient Support Coupled Frequent Pattern Mining Over Progressive Databases

New Approach for the Sequential Pattern Mining of High-Dimensional Sequence Databases

Intelligent Sequential Mining Via Alignment: Optimization Techniques For Very Large Db

Imcs: Incremental Mining Of Closed Sequential Patterns

Mining Order-Preserving Submatrices Under Data Uncertainty: A Possible-World Approach and Efficient Approximation Methods

Mining frequent temporal duration-based patterns on time interval sequential database