Constrained Frequent Subtree Mining Method

Kun Han, Weifeng Lv, Baocai Yin, Yongli Hu
DOI: https://doi.org/10.1109/ICDH.2014.62
2014-01-01
Abstract: As semi-structured data grows rapidly, obtaining valuable information from it is crucial for many applications. Many data mining methods have been proposed for this purpose, and frequent subtree mining is an important and typical one. Current mining methods demand substantial computational time and space and return a huge number of patterns, yet they often miss some important subtrees and return patterns that are uninteresting to users. In this paper, we propose two novel algorithms, FSMDC and FSMIC, for mining frequent embedded subtrees from a database of rooted, labeled, ordered trees. The two algorithms introduce a distance constraint and an interest constraint, respectively, to achieve the expected mining results. Experiments show that the two newly developed algorithms are efficient, scalable, and more consistent with users' purposes.