Interactive Constrained Association Rule Mining

Bart Goethals, Jan Van den Bussche
DOI: https://doi.org/10.48550/arXiv.cs/0112011
2003-02-05
Abstract: We investigate ways to support interactive mining sessions, in the setting of association rule mining. In such sessions, users specify conditions (queries) on the associations to be generated. Our approach is a combination of the integration of querying conditions inside the mining phase, and the incremental querying of already generated associations. We present several concrete algorithms and compare their performance.
Databases, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper addresses is how to let users efficiently generate association rules that satisfy specific conditions during an interactive mining session. Specifically, it explores how to integrate the user's query conditions (such as the presence of specific items in the rule body or rule head, and support and confidence thresholds) into the mining algorithm, so that queries are answered faster and unnecessary computation is avoided.

### Core problems of the paper:

1. **How to efficiently handle user-specified query conditions in association rule mining**: Traditional association rule mining usually generates a large number of rules and only afterwards filters out those that satisfy the conditions, which is inefficient. The paper instead integrates the query conditions directly into the mining algorithm, reducing unnecessary computation.
2. **How to reuse the generated association rules in an interactive mining session**: To improve efficiency further, the paper also discusses how to reuse the results of earlier queries within one mining session and avoid duplicate computation.

### Specific problem description:

- **Types of constraint conditions**: The paper considers Boolean combinations of atomic conditions, where an atomic condition either:
  - specifies whether a certain item appears in the rule body or rule head (such as \( \text{Body}(i) \) or \( \text{Head}(i) \)), or
  - sets a support or confidence threshold (such as \( \text{support} \geq 10\% \) or \( \text{confidence} \geq 80\% \)).
- **Optimization objectives**: By integrating these constraints into the mining algorithm, the paper aims to reduce both the number of itemsets that are generated but do not satisfy the query conditions and the number of database scans, thereby improving mining efficiency.
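The constraint language described above can be sketched in a few lines of Python. This is only an illustration of "Boolean combinations of atomic conditions", not the paper's own representation: the `Rule` class and the helper names (`body_has`, `head_has`, `min_support`, `min_confidence`, `all_of`, `any_of`, `negate`) are hypothetical, and the item names and numbers are made up.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record: body => head with its two quality measures."""
    body: FrozenSet[str]
    head: FrozenSet[str]
    support: float      # fraction of transactions containing body and head together
    confidence: float   # support(body and head) / support(body)

# A condition is any predicate over a rule.
Condition = Callable[[Rule], bool]

# Atomic conditions: Body(i), Head(i), and the two thresholds.
def body_has(item: str) -> Condition:
    return lambda r: item in r.body

def head_has(item: str) -> Condition:
    return lambda r: item in r.head

def min_support(threshold: float) -> Condition:
    return lambda r: r.support >= threshold

def min_confidence(threshold: float) -> Condition:
    return lambda r: r.confidence >= threshold

# Boolean combinators over conditions.
def all_of(*conds: Condition) -> Condition:
    return lambda r: all(c(r) for c in conds)

def any_of(*conds: Condition) -> Condition:
    return lambda r: any(c(r) for c in conds)

def negate(cond: Condition) -> Condition:
    return lambda r: not cond(r)

# Example query: Body(bread) AND NOT Head(milk) AND support >= 10% AND confidence >= 80%.
query = all_of(
    body_has("bread"),
    negate(head_has("milk")),
    min_support(0.10),
    min_confidence(0.80),
)

rule = Rule(frozenset({"bread", "butter"}), frozenset({"jam"}), 0.12, 0.85)
print(query(rule))
```

Representing conditions as plain predicates keeps the sketch short, but it only supports filtering finished rules. Pushing a condition into the mining phase, as the paper proposes, would require a representation the miner can inspect.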
### Solutions:

The paper proposes three different methods to handle these query conditions:

1. **Integrated Querying**: Integrate the constraints directly into the mining algorithm so that only itemsets and rules satisfying the conditions are generated.
2. **Post-Processing**: First perform unconstrained global mining, then filter the generated rules down to those that satisfy the conditions.
3. **Incremental Querying**: Combine the advantages of the first two methods by generating the rules that satisfy the conditions step by step and reusing the results of previous queries.

### Main contributions:

- Proposes the first algorithm that can efficiently support interactive mining sessions.
- Shows that querying with constraints can significantly improve performance, especially when the constraints are relatively strict.
- Discusses how to reuse the generated itemsets within one mining session and avoid duplicate computation.

### Summary:

This paper addresses interactive querying in association rule mining, in particular how to efficiently generate the rules that satisfy the conditions when users adjust their query conditions frequently. By integrating constraints into the mining algorithm and combining this with incremental querying, the paper arrives at a more efficient solution.
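To make the difference between post-processing and reuse across queries concrete, here is a toy sketch. It is not any of the paper's algorithms: `mine_rules` is a brute-force miner suitable only for tiny inputs, and the `Session` class with its caching policy (re-mine only when a new query asks for a lower threshold than the cached run) is an assumption chosen for illustration. Integrated querying would go further by pushing the item conditions into the mining loop itself.

```python
from itertools import combinations

def mine_rules(transactions, min_sup, min_conf):
    """Brute-force miner for tiny inputs: try every itemset, then every body/head split."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def sup(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = sup(itemset)
            if s == 0 or s < min_sup:
                continue
            for k in range(1, size):
                for body_items in combinations(combo, k):
                    body = frozenset(body_items)
                    conf = s / sup(body)
                    if conf >= min_conf:
                        rules.append((body, itemset - body, s, conf))
    return rules

class Session:
    """Hypothetical interactive session that reuses earlier mining results."""

    def __init__(self, transactions):
        self.transactions = transactions
        self.cache = None      # (min_sup, min_conf, rules) of the last mining run
        self.mining_runs = 0

    def query(self, min_sup, min_conf, condition=lambda rule: True):
        # Re-mine only if the cached run used stricter thresholds than requested.
        if self.cache is None or min_sup < self.cache[0] or min_conf < self.cache[1]:
            self.cache = (min_sup, min_conf,
                          mine_rules(self.transactions, min_sup, min_conf))
            self.mining_runs += 1
        # Post-processing step: filter the cached rules for this query.
        return [r for r in self.cache[2]
                if r[2] >= min_sup and r[3] >= min_conf and condition(r)]

transactions = [
    frozenset({"bread", "butter", "jam"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk"}),
    frozenset({"butter", "jam"}),
]

session = Session(transactions)
session.query(0.25, 0.5)                                      # first query triggers mining
stricter = session.query(0.5, 0.6, lambda r: "bread" in r[0])  # answered from the cache
for body, head, s, conf in stricter:
    print(sorted(body), "=>", sorted(head), round(s, 2), round(conf, 2))
print("mining runs:", session.mining_runs)
```

The second query is stricter than the first in both thresholds, so it is answered by filtering cached rules. A later query with a lower threshold would fall outside the cache and trigger a new mining run.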