Mining Top-K constrained cross-level high-utility itemsets over data streams
Meng Han, Shujuan Liu, Zhihui Gao, Dongliang Mu, Ang Li
DOI: https://doi.org/10.1007/s10115-023-02045-8
IF: 2.7
2024-01-21
Knowledge and Information Systems
Abstract: Cross-Level High-Utility Itemsets Mining (CLHUIM) aims to discover interesting relationships between hierarchy levels by introducing the taxonomy of items. Because existing CLHUIM algorithms struggle with large search spaces, researchers have proposed mining the Top-K cross-level high-utility itemsets (CLHUIs). However, the results obtained by these methods often contain redundant itemsets with significant differences in hierarchy levels and a large proportion of itemsets at higher abstraction levels, so they neglect detailed information and cannot report itemsets within a specified hierarchy range. Additionally, they are unable to handle dynamic transactional data. To address these problems, this paper proposes the Top-K Constrained Cross-Level High-Utility Itemsets Mining (TKCCLHM) algorithm to efficiently mine Top-K itemsets across different hierarchy levels over data streams. Firstly, a new hierarchical level concept is introduced to control the abstraction level of the introduced items, and Top-K itemsets are mined within a specific hierarchy range based on this concept. Secondly, a sliding window-based data structure called the Sliding Window-based Utility Projection List (SUPL) is designed, which is combined with transaction projection techniques to mine CLHUIs efficiently. Lastly, a Batch and Utility Hash Table (BUHT) structure capable of storing batch and (generalized) item utility information is proposed, along with a new threshold-raising strategy. Extensive experiments on six datasets with taxonomy information demonstrate that the proposed algorithm achieves significant improvements in runtime and scalability compared with state-of-the-art algorithms.
computer science, information systems, artificial intelligence
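
Illustrative sketch (not the authors' TKCCLHM implementation): the problem setting in the abstract can be made concrete with a small brute-force example. It assumes a hypothetical two-level taxonomy (`PARENT`), takes the utility of a generalized item in a transaction to be the sum of the utilities of its descendants in that transaction, keeps only the most recent batches in a fixed-size sliding window, and restricts candidates to a level range `[min_level, max_level]`. All names, the toy data, and the `max_size` cap are assumptions for illustration; the SUPL and BUHT structures and the paper's pruning strategies are not reproduced here.

```python
from collections import deque
from itertools import combinations
import heapq

# Hypothetical taxonomy: child -> parent (None marks a top-level category).
PARENT = {"cola": "drink", "juice": "drink", "bread": "food",
          "drink": None, "food": None}

def level(item):
    """Depth of an item in the taxonomy; top-level categories are level 1."""
    depth = 1
    while PARENT[item] is not None:
        item = PARENT[item]
        depth += 1
    return depth

def is_ancestor(a, b):
    """True if a is a proper ancestor of b."""
    p = PARENT[b]
    while p is not None:
        if p == a:
            return True
        p = PARENT[p]
    return False

def extend(transaction):
    """Add generalized items: each ancestor gets the summed utility of its descendants."""
    out = dict(transaction)
    for item, util in transaction.items():
        p = PARENT[item]
        while p is not None:
            out[p] = out.get(p, 0) + util
            p = PARENT[p]
    return out

def top_k(window, k, min_level, max_level, max_size=2):
    """Brute-force top-k itemsets by utility within a level range over the window."""
    totals = {}
    for batch in window:
        for t in batch:
            ext = extend(t)
            items = sorted(i for i in ext if min_level <= level(i) <= max_level)
            for size in range(1, max_size + 1):
                for combo in combinations(items, size):
                    # An itemset may not contain an item together with its ancestor.
                    if any(is_ancestor(a, b) or is_ancestor(b, a)
                           for a, b in combinations(combo, 2)):
                        continue
                    totals[combo] = totals.get(combo, 0) + sum(ext[i] for i in combo)
    heap, threshold = [], 0  # min-heap of (utility, itemset); threshold = k-th best so far
    for itemset, util in totals.items():
        if len(heap) < k:
            heapq.heappush(heap, (util, itemset))
        elif util > threshold:
            heapq.heapreplace(heap, (util, itemset))
        if len(heap) == k:
            threshold = heap[0][0]  # raise the internal minimum-utility threshold
    return sorted(heap, reverse=True)

# Sliding window holding the two most recent batches; the oldest batch is evicted.
window = deque(maxlen=2)
for batch in ([{"cola": 4, "bread": 2}, {"juice": 3}],
              [{"cola": 1, "juice": 2, "bread": 5}],
              [{"bread": 6}]):
    window.append(batch)

print(top_k(window, 2, 1, 1))  # [(11, ('food',)), (8, ('drink', 'food'))]
```

Here the threshold only decides which candidates stay in the top-k heap; an efficient miner would also use it to prune the search space, which this enumeration does not attempt.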