Mining Frequent Induced Subtree Patterns with Subtree-Constraint

Lei Zou, Yansheng Lu, Huaming Zhang, Rong Hu
DOI: https://doi.org/10.1109/ICDMW.2006.112
2006-01-01
Abstract: Mining frequent induced subtree patterns is useful in domains such as XML databases and web log analysis. However, because of combinatorial explosion, mining all frequent subtree patterns becomes infeasible for a large, dense tree database. Too many frequent subtree patterns also overwhelm users, who are usually interested in only a small subset of the mining results. In this paper, we propose the problem of discovering frequent induced subtree patterns that are supertrees of a user-specified pattern tree, i.e., frequent induced subtree patterns with subtree-constraint. Most existing frequent subtree mining algorithms are based on right-most extension, which does not work well for this new problem, so we present free extension to replace it. To avoid the duplicate patterns that free extension can generate, we develop an efficient method that ensures no duplicate patterns arise during mining or in the results. We then give the Subtree-Constraint Frequent Subtree Patterns Mining algorithm, i.e., the SCFS algorithm. Experimental results show that our algorithm achieves good performance.
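To make the problem statement concrete, the following is a minimal sketch of what "frequent induced subtree pattern with subtree-constraint" means. It is not the SCFS algorithm and uses neither free extension nor any duplicate-avoidance scheme: it only filters a hand-written list of candidate patterns by brute force. It assumes rooted, ordered, labeled trees, and the names `T`, `matches_at`, `contains`, `support`, and `constrained_frequent` are hypothetical helpers introduced for illustration.

```python
def T(label, *children):
    """Build a rooted, ordered, labeled tree as (label, (child, ...))."""
    return (label, tuple(children))


def matches_at(pattern, node):
    """True if pattern embeds with its root on node, keeping every
    parent-child edge (induced) and the left-to-right sibling order."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    # Match pattern children, in order, to a subsequence of node children.
    i = 0
    for child in n_children:
        if i < len(p_children) and matches_at(p_children[i], child):
            i += 1
    return i == len(p_children)


def nodes(tree):
    """Yield every node (as a subtree) of tree in preorder."""
    yield tree
    for child in tree[1]:
        yield from nodes(child)


def contains(tree, pattern):
    """True if pattern is an induced subtree of tree."""
    return any(matches_at(pattern, n) for n in nodes(tree))


def support(database, pattern):
    """Number of database trees that contain pattern."""
    return sum(contains(t, pattern) for t in database)


def constrained_frequent(candidates, database, constraint, min_support):
    """Keep candidates that are supertrees of constraint and are frequent."""
    return [
        p for p in candidates
        if contains(p, constraint) and support(database, p) >= min_support
    ]


# Illustrative data only (assumed, not taken from the paper).
db = [
    T("A", T("B"), T("C", T("D"))),
    T("A", T("B"), T("C")),
    T("A", T("C", T("D"))),
]
constraint = T("C", T("D"))        # user-specified pattern tree
p1 = T("A", T("C", T("D")))        # supertree of constraint, in 2 trees
p2 = T("A", T("B"), T("C"))        # frequent, but lacks the constraint
p3 = T("A", T("B"), T("C", T("D")))  # has the constraint, in only 1 tree

result = constrained_frequent([p1, p2, p3], db, constraint, min_support=2)
print(result)
```

With a minimum support of 2, only `p1` survives: `p2` is frequent but is not a supertree of the constraint, and `p3` satisfies the constraint but occurs in a single tree. A real miner would generate candidates by growing the constraint tree rather than receiving them as input, which is where the extension strategy discussed in the abstract matters.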