Comparative Study of Sequential Pattern Mining Frameworks: Support Framework vs. Multiple Alignment Framework

Hye-Chung Kum, Susan Paulsen, Wei Wang
2002-01-01
Abstract: Knowledge discovery and data mining (KDD) is commonly defined as the nontrivial process of finding interesting, novel, and useful patterns in data. In this paper, we closely examine the problem of mining sequential patterns and propose a comprehensive evaluation method to assess the quality of the mined results. We propose four evaluation criteria, namely (1) recoverability, (2) the number of spurious patterns, (3) the number of redundant patterns, and (4) the degree of extraneous items in the patterns, to quantitatively assess the quality of results mined from a wide variety of synthetic datasets with varying randomness and noise levels. Recoverability, a new metric, measures how much of the underlying trend has been detected. Such an evaluation method provides a basis for comparing different frameworks for sequential pattern mining, which is essential for understanding the performance of approximate solutions. We employ the method to conduct a detailed comparison of the traditional frequent sequential pattern framework with an alternative approximate pattern framework based on sequence alignment. We demonstrate that the alternative approach is best able to recover the underlying patterns with little confounding information under all circumstances, including those where the frequent sequential pattern framework fails.
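To make two of the criteria concrete, the following is a minimal sketch, not the paper's definitions. It assumes that each pattern is a flat list of items (rather than a sequence of itemsets), that recoverability can be approximated as the average fraction of each embedded base pattern found, in order, in its best-matching mined pattern, and that extraneous items are the mined items left unmatched by any base pattern. The function names `lcs_length`, `recoverability`, and `extraneous_fraction` are hypothetical.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def recoverability(base: List[List[str]], mined: List[List[str]]) -> float:
    """Assumed proxy: mean over base patterns of the best in-order
    fraction of items recovered by any mined pattern.
    Base patterns are assumed to be non-empty."""
    if not base:
        return 0.0
    total = 0.0
    for b in base:
        best = max((lcs_length(b, m) for m in mined), default=0)
        total += best / len(b)
    return total / len(base)


def extraneous_fraction(base: List[List[str]], mined: List[List[str]]) -> float:
    """Assumed proxy: share of mined items that no base pattern accounts for,
    counting each mined pattern against its best-matching base pattern."""
    total_items = sum(len(m) for m in mined)
    if total_items == 0:
        return 0.0
    unmatched = 0
    for m in mined:
        best = max((lcs_length(b, m) for b in base), default=0)
        unmatched += len(m) - best
    return unmatched / total_items
```

For a single base pattern `a b c d` and a single mined pattern `a b x d`, this sketch reports a recoverability of 0.75 (three of four base items recovered in order) and an extraneous fraction of 0.25 (the item `x`). Counting spurious and redundant patterns would additionally require a rule for assigning each mined pattern to a base pattern, which the abstract does not specify.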