Efficient algorithms to mine concise representations of frequent high utility occupancy patterns

Hai Duong, Huy Pham, Tin Truong, Philippe Fournier-Viger
DOI: https://doi.org/10.1007/s10489-024-05296-2
IF: 5.3
2024-03-20
Applied Intelligence
Abstract: Identifying all frequent high utility occupancy itemsets (FHUOIs) in a quantitative transaction dataset is a new trend in data mining. Because they combine frequency with utility occupancy, these patterns are better suited to several real-world applications: they not only reflect the interests of most users but also contribute a high proportion of the utility in their supporting transactions. Nonetheless, the set of all discovered FHUOIs may be very large, especially for large and dense datasets or for low values of the predefined minimum thresholds. For this reason, it is often quite challenging for users to analyze and use the obtained patterns. To address this issue, this paper proposes two novel algorithms named MaxCloFHUOIM and CloFHUOIM to extract compact representations of FHUOIs. The former is designed to simultaneously mine two concise representations of FHUOIs, namely all closed FHUOIs and all maximal FHUOIs, whereas the latter discovers only the closed FHUOIs, which provide a lossless summary of all FHUOIs. The proposed algorithms rely on a novel weak upper bound on utility occupancy to reduce the search space by quickly pruning itemsets with low utility occupancy. In addition, the algorithms integrate two new efficient strategies to prune non-closed FHUOI candidate branches early in the prefix search tree. Results from an in-depth experimental evaluation conducted on several benchmark real-life and synthetic quantitative datasets demonstrate that MaxCloFHUOIM and CloFHUOIM have excellent performance in terms of runtime, memory usage, and scalability. In particular, the proposed algorithms are up to two orders of magnitude faster than a baseline algorithm.
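To make the notions in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not MaxCloFHUOIM or CloFHUOIM and uses no pruning. It assumes the definition commonly used in the utility occupancy literature: the utility occupancy of an itemset in a transaction is the itemset's utility divided by the transaction's total utility, and the utility occupancy of the itemset in the dataset is the average of that share over its supporting transactions. The toy dataset, the thresholds, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy dataset: each transaction maps item -> utility
# (for example, quantity times unit profit). Not taken from the paper.
transactions = [
    {"a": 4, "b": 6},
    {"a": 2, "c": 8},
    {"a": 5, "b": 5, "c": 10},
    {"b": 3, "c": 1},
]

def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if itemset.issubset(t))

def utility_occupancy(itemset, db):
    """Average share of transaction utility contributed by the itemset,
    taken over its supporting transactions (assumed definition)."""
    shares = [
        sum(t[i] for i in itemset) / sum(t.values())
        for t in db
        if itemset.issubset(t)
    ]
    return sum(shares) / len(shares) if shares else 0.0

def mine_fhuois(db, min_sup, min_uo):
    """Enumerate every itemset and keep those meeting both thresholds.
    Exponential in the number of items; for illustration only."""
    items = sorted({i for t in db for i in t})
    found = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            sup = support(x, db)
            if sup < min_sup:
                continue
            uo = utility_occupancy(x, db)
            if uo >= min_uo:
                found[x] = (sup, uo)
    return found

def maximal(fhuois):
    """FHUOIs that have no proper superset among the FHUOIs."""
    return {x for x in fhuois if not any(x < y for y in fhuois)}

fhuois = mine_fhuois(transactions, min_sup=2, min_uo=0.5)
for x, (sup, uo) in sorted(fhuois.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(x), sup, round(uo, 3))
print("maximal:", sorted(sorted(x) for x in maximal(fhuois)))
```

On this toy data, the itemset {a} fails the utility occupancy threshold while its superset {a, b} passes it. Utility occupancy therefore cannot be used directly for downward-closure pruning, which is consistent with the abstract's reliance on an upper bound to cut the search space.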
computer science, artificial intelligence