USER: Towards High-Utility Sequential Rules with Repetitive Items.

Hong Lin, Wensheng Gan, Gengsen Huang, Philip S. Yu
DOI: https://doi.org/10.1109/BigData59044.2023.10386473
2023-01-01
Abstract: Discovering interesting sequential rules in a sequence database is important for a variety of fields, ranging from customer behavior analysis to intrusion detection. High-utility sequential rule mining (HUSRM) was proposed to obtain more informative rules. Its goal is to find sequential rules with both high utility and high confidence, i.e., HUSRs. As far as we know, only a few algorithms have been proposed to discover HUSRs, and these algorithms do not fully account for repetitive items in the sequences of the database. In this paper, we propose an algorithm named USER to discover HUSRs in multi-sequences that contain repetitive items. A data structure called an occurrence information (OI)-list is designed to distinguish the different occurrences of an item in a sequence. Moreover, we discuss in detail how the upper-bound value changes after rule expansion, which is complicated by the repetitive items. We also introduce two pruning strategies (ROOR and REIO-I) to improve mining efficiency when a sequence contains many repetitive items. Finally, we conduct experiments on several datasets, and the results show that USER discovers HUSRs with more accurate utility values within acceptable running time and memory consumption.
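To make the role of repetitive items concrete, here is a minimal sketch of why separate occurrences of the same item need to be told apart. It is not the paper's OI-list, whose layout the abstract does not specify. The toy sequence, the function names `build_occurrence_index` and `best_utility_before`, and the "take the highest-utility earlier occurrence" convention are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy sequence: a list of itemsets, each mapping item -> utility.
# Item "a" repeats in every itemset with a different utility each time.
sequence = [
    {"a": 2, "b": 1},
    {"a": 3},
    {"c": 4, "a": 1},
]


def build_occurrence_index(seq):
    """Map each item to its (itemset_position, utility) occurrences, in order."""
    index = defaultdict(list)
    for pos, itemset in enumerate(seq):
        for item, utility in itemset.items():
            index[item].append((pos, utility))
    return dict(index)


def best_utility_before(index, item, pos):
    """Highest utility of `item` among occurrences strictly before `pos`.

    Assumed convention for illustration only; returns None if there is no
    earlier occurrence.
    """
    earlier = [u for p, u in index.get(item, []) if p < pos]
    return max(earlier) if earlier else None


occurrences = build_occurrence_index(sequence)
# For a rule such as a -> c, only occurrences of "a" before "c" (position 2)
# can serve as the antecedent, and they carry different utilities.
antecedent_utility = best_utility_before(occurrences, "a", 2)
```

Under these assumptions, treating "a" as a single occurrence would lose the fact that its utility depends on which occurrence is matched, which is the kind of inaccuracy the abstract attributes to methods that ignore repetition.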
What problem does this paper attempt to address?