Efficient mining of concept-hierarchy aware distinguishing sequential patterns

Chengxin He, Lei Duan, Guozhu Dong, Jyrki Nummenmaa, Tingting Wang, Tinghai Pang
DOI: https://doi.org/10.1016/j.knosys.2022.109710
2022-11-14
Abstract: Distinguishing sequential patterns are sequential patterns that have much higher frequencies in one target group of sequences (concerning a given phenomenon of interest) than in a contrasting group of sequences. Distinguishing sequential patterns are useful for many machine learning tasks, as well as for explaining and characterizing the phenomenon underlying the target group of sequences. However, previous studies on mining distinguishing sequential patterns did not consider the hierarchical relationships among elements in sequences. To fill this gap, this paper investigates the mining of distinguishing sequential patterns in the presence of concept hierarchies among sequence elements. The associated patterns are called concept-hierarchy aware distinguishing sequential patterns (hDSPs). After presenting the challenges of mining hDSPs, we present hDSP-Miner, a method for mining hDSPs that uses effective pruning techniques. Our empirical study using real-world protein sequences demonstrates that hDSP-Miner is effective and efficient, and that it can discover more novel distinguishing sequential patterns than previous algorithms designed for the same task.
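To make the definition above concrete, here is a minimal Python sketch of checking whether one candidate pattern is distinguishing when its elements may be concepts from a hierarchy. It is not hDSP-Miner and contains no pruning. The hierarchy PARENT, the toy sequences, the plain subsequence matching, and the thresholds alpha and beta are all illustrative assumptions, not taken from the paper, which may impose further constraints.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical concept hierarchy (child -> parent); labels are illustrative only.
PARENT: Dict[str, str] = {
    "A": "hydrophobic", "V": "hydrophobic",
    "K": "charged", "D": "charged",
}

def matches(pattern_elem: str, seq_elem: str, parent: Dict[str, str]) -> bool:
    """True if seq_elem equals pattern_elem or has it as an ancestor."""
    node: Optional[str] = seq_elem
    while node is not None:
        if node == pattern_elem:
            return True
        node = parent.get(node)  # climb one level; None at the root
    return False

def contains(pattern: Sequence[str], seq: Sequence[str],
             parent: Dict[str, str]) -> bool:
    """Greedy subsequence test: pattern elements must occur in order in seq."""
    i = 0
    for elem in seq:
        if i < len(pattern) and matches(pattern[i], elem, parent):
            i += 1
    return i == len(pattern)

def support(pattern: Sequence[str], group: List[Sequence[str]],
            parent: Dict[str, str]) -> float:
    """Fraction of sequences in group that contain the pattern."""
    if not group:
        return 0.0
    return sum(contains(pattern, s, parent) for s in group) / len(group)

def is_distinguishing(pattern: Sequence[str], target: List[Sequence[str]],
                      contrast: List[Sequence[str]], parent: Dict[str, str],
                      alpha: float, beta: float) -> bool:
    """Assumed criterion: frequent in target (>= alpha), rare in contrast (<= beta)."""
    return (support(pattern, target, parent) >= alpha
            and support(pattern, contrast, parent) <= beta)

# Toy data (made up for illustration).
target = ["AKD", "VKA", "VDK"]
contrast = ["KAD", "DKV", "AAV"]

general = ["hydrophobic", "charged"]  # pattern over hierarchy concepts
concrete = ["A", "K"]                 # pattern over leaf elements only

print(is_distinguishing(general, target, contrast, PARENT, alpha=0.9, beta=0.4))
print(is_distinguishing(concrete, target, contrast, PARENT, alpha=0.9, beta=0.4))
```

On this toy data the concept-level pattern meets both thresholds while the leaf-level pattern does not, which illustrates why ignoring the hierarchy can hide distinguishing patterns.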
computer science, artificial intelligence