A Nettree for Approximate Maximal Pattern Matching with Gaps and One-Off Constraint

Youxi Wu, Xindong Wu, He Jiang, Fan Min
DOI: https://doi.org/10.1109/ictai.2010.81
2010-01-01
Abstract: Pattern matching with flexible gap constraints has recently attracted extensive attention, especially in biological sequence analysis and in mining patterns from sequences. One such problem is Maximal Pattern Matching with Gaps and the One-Off Condition (MPMGOOC). We first introduce the concept of MPMGOOC. To solve the problem, we propose several specialized concepts for the Nettree, a structure that differs from a tree in that a node may have more than one parent. Based on the Nettree, we propose an algorithm named Heuristic Search Occurrence (HSO). The space and time complexities of the algorithm are O(W * m * n) and O(W * n * (n + m * m)), respectively, where m is the length of pattern P, n is the length of sequence S, and W is the maximal gap. Comparison results show that HSO achieves better performance than a state-of-the-art algorithm in most cases when tested on real-world biological data.
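To make the problem statement concrete, the following minimal sketch collects occurrences of a pattern under gap constraints and the one-off condition, where each position of the sequence may be used by at most one occurrence. It is not the HSO algorithm and does not build a Nettree; it is a plain greedy leftmost search, so it may find fewer occurrences than the maximum. The function names, the 0-based positions, and the use of a single `[min_gap, max_gap]` range between every pair of consecutive pattern characters are assumptions made for illustration only.

```python
def find_occurrence(seq, pattern, min_gap, max_gap, used, start):
    """Return one occurrence (list of positions) beginning at `start`, or None.

    Assumption: the gap is the number of characters strictly between two
    consecutive matched positions, and the same range applies to every gap.
    """
    def extend(j, prev):
        # All pattern characters are matched: nothing left to append.
        if j == len(pattern):
            return []
        lo = prev + 1 + min_gap
        hi = min(prev + 1 + max_gap, len(seq) - 1)
        for pos in range(lo, hi + 1):
            if not used[pos] and seq[pos] == pattern[j]:
                rest = extend(j + 1, pos)
                if rest is not None:
                    return [pos] + rest
        return None  # no unused position can continue this occurrence

    if used[start] or seq[start] != pattern[0]:
        return None
    rest = extend(1, start)
    return None if rest is None else [start] + rest


def greedy_one_off(seq, pattern, min_gap, max_gap):
    """Greedily collect occurrences so that no position is used twice."""
    if not pattern:
        return []
    used = [False] * len(seq)
    occurrences = []
    for start in range(len(seq)):
        occ = find_occurrence(seq, pattern, min_gap, max_gap, used, start)
        if occ is not None:
            for pos in occ:
                used[pos] = True  # enforce the one-off condition
            occurrences.append(occ)
    return occurrences


if __name__ == "__main__":
    print(greedy_one_off("aabb", "ab", 0, 1))
```

With the sequence `aabb`, the pattern `ab`, and gaps between 0 and 1, the sketch reports two occurrences that share no position. A greedy choice made early can block later occurrences, which is the kind of loss a heuristic search over a richer structure is meant to reduce.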