PFTM: A Frequent Subtrees Mining Algorithm Based on Projection

YANG Pei, ZHENG Qi-Lun, PENG Hong, LI Ying-Ji
DOI: https://doi.org/10.3969/j.issn.1002-137X.2005.02.057
2005-01-01
Computer Science
Abstract: Mining frequent subtrees is very useful in domains such as Web mining, XML data analysis, and bioinformatics. A novel algorithm called PFTM, which discovers all frequent subtrees in a forest based on projection, is presented. The process partitions both the projected database and the set of candidate nodes. PFTM does not need to generate any candidate subtrees. A novel method of recursively updating the set of candidate nodes is applied to greatly reduce the search space. We compare PFTM with FREQT, conducting detailed experiments on synthetic datasets to test the performance and scalability of the two methods. Experiments show that PFTM outperforms FREQT by an average of 25 percent.