Targeted Mining Precise-positioning Episode Rules

Jian Zhu, Xiaoye Chen, Wensheng Gan, Zefeng Chen, Philip S. Yu
2024-06-08
Abstract: The era characterized by an exponential increase in data has led to the widespread adoption of data intelligence as a crucial task. Within the field of data mining, frequent episode mining has emerged as an effective tool for extracting valuable and essential information from event sequences. Various algorithms have been developed to discover frequent episodes and subsequently derive episode rules using the frequency function and anti-monotonicity principles. However, currently, there is a lack of algorithms specifically designed for mining episode rules that encompass user-specified query episodes. To address this challenge and enable the mining of target episode rules, we introduce the definition of targeted precise-positioning episode rules and formulate the problem of targeted mining precise-positioning episode rules. Most importantly, we develop an algorithm called Targeted Mining Precision Episode Rules (TaMIPER) to address the problem and optimize it using four proposed strategies, leading to significant reductions in both time and space resource requirements. As a result, TaMIPER offers high accuracy and efficiency in mining episode rules of user interest and holds promising potential for prediction tasks in various domains, such as weather observation, network intrusion, and e-commerce. Experimental results on six real datasets demonstrate the exceptional performance of TaMIPER.
Databases
What problem does this paper attempt to address?
The paper addresses a gap in Frequent Episode Mining (FEM): no existing algorithm is designed to mine episode rules that contain a user-specified query episode. Existing FEM algorithms can discover frequent episodes and derive rules from them, but they produce all such rules, most of which do not involve the events a given user cares about. The paper therefore defines targeted precise-positioning episode rules and proposes an algorithm, Targeted Mining Precision Episode Rules (TaMIPER), that efficiently mines rules which contain the user-specified query events and carry precise timing information.

### Main contributions of the paper:

1. **Introduced the concept of targeted precise-positioning episode rules**: Defined query episodes and target episode rules, and formalized the problem of mining targeted precise-positioning episode rules.
2. **Developed the TaMIPER algorithm**: The algorithm discovers complete and accurate episode rules that match the user's target.
3. **Proposed four pruning strategies**: These strategies significantly reduce time and space requirements and improve the efficiency of the algorithm.
4. **Conducted extensive experimental verification**: Experiments on six real-world datasets show that TaMIPER outperforms existing benchmark methods.

### Background and motivation:

- **Frequent Episode Mining (FEM)**: FEM identifies frequent episodes in a single event sequence and is widely used on traffic data, network logs, and financial data.
- **Episode rules**: An episode rule expresses a relationship between two episodes, for example, "after event A occurs, event B will occur within a certain time". Traditional FEM algorithms can discover all frequent episode rules, but many of those rules may be irrelevant to the user.
- **The need for target rule mining**: In some applications, users are interested only in specific event combinations. For example, in finance, a user may want to know how stock prices change after specific events occur. Traditional methods generate a large number of irrelevant rules, which wastes time and memory.

### Solutions:

- **Define target rules**: The paper defines targeted precise-positioning episode rules (TaPER). Such rules contain the user-specified query events and also carry precise timing information.
- **Algorithm design**: The TaMIPER algorithm is based on a tree structure and has two main stages:
  - **First stage**: Mine the superset of query events to provide location information for potential target rules.
  - **Second stage**: Extract frequent Minimal Event Occurrences (MEO) and use these MEOs as antecedents to efficiently derive TaPER.
- **Pruning strategies**: The paper proposes four pruning strategies, which improve efficiency by cutting away unnecessary parts of the search space.

### Experimental results:

- **Superior performance**: TaMIPER outperforms existing benchmark methods on multiple real-world datasets, especially on large-scale data.
- **Application prospects**: TaMIPER has potential applications in fields such as weather observation, network security, and e-commerce, where it can support more accurate prediction and decision-making.

In conclusion, by defining targeted precise-positioning episode rules and proposing the TaMIPER algorithm, the paper addresses the lack of target rule mining in existing FEM algorithms and provides a new tool for research and applications in related fields.
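
To make the idea of a targeted, time-positioned rule concrete, here is a minimal brute-force sketch in Python. It is not the TaMIPER algorithm and uses none of its tree structure, MEO extraction, or pruning strategies. It assumes single-event antecedents and consequents, integer timestamps, a rule of the form "b occurs exactly `gap` time units after a", and that the query events must appear in the consequent. The function name `mine_targeted_rules` and its parameters are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def mine_targeted_rules(sequence, query_events, max_gap, min_conf):
    """Toy miner for rules `a -[gap]-> b` where b is a query event.

    sequence: list of (timestamp, event) pairs with integer timestamps.
    Returns a dict mapping (a, gap, b) to the rule's confidence.
    Assumption: this is an illustrative brute-force scan, not TaMIPER.
    """
    # Group events by timestamp so simultaneous events are supported.
    events_at = defaultdict(set)
    for t, e in sequence:
        events_at[t].add(e)

    antecedent_count = Counter()  # occurrences of each candidate antecedent
    rule_count = Counter()        # occurrences of a followed by b at t + gap

    for t, events in events_at.items():
        for a in events:
            antecedent_count[a] += 1
            for gap in range(1, max_gap + 1):
                # .get avoids creating new keys while iterating the dict.
                for b in events_at.get(t + gap, ()):
                    # Targeting: only count rules that end in a query event.
                    if b in query_events:
                        rule_count[(a, gap, b)] += 1

    # Keep rules whose confidence reaches the threshold.
    rules = {}
    for (a, gap, b), n in rule_count.items():
        conf = n / antecedent_count[a]
        if conf >= min_conf:
            rules[(a, gap, b)] = conf
    return rules


sequence = [(1, "A"), (2, "B"), (3, "C"), (4, "A"),
            (5, "B"), (6, "D"), (7, "A"), (9, "C")]
rules = mine_targeted_rules(sequence, query_events={"C"}, max_gap=2, min_conf=0.6)
print(rules)
```

On this toy sequence, the only rule kept is "C occurs exactly 2 time units after A", with confidence 2/3 (two of the three occurrences of A). The rule "C occurs 1 time unit after B" has confidence 1/2 and is dropped. Restricting the count to query events inside the scan, rather than filtering after mining every rule, is the simplest form of the targeting idea the summary describes.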