Decomposition of Association Rules Mining Process and Analysis of the Intermediate Results

Zhang Yan, Luoming Meng, Feng Liu, Honghui Li, Peter Brezány
2011-01-01
Abstract: Association rules mining is a method for discovering interesting relations between variables in large databases. Data mining systems and platforms typically treat the association rules mining process as a black box and consider only the function of the algorithm as a whole. What happens during mining is therefore invisible to users and applications, which makes it difficult to analyze the intermediate results and to predict the execution cost before execution. This paper decomposes a typical association rules mining algorithm, Apriori, into an execution process composed of finer-grained data mining operators. Based on this decomposition, the size of the intermediate result of every data mining operator is estimated, which makes it possible to estimate the performance of the Apriori execution process precisely and gives users an indication of the intermediate results and the execution cost of the algorithm. The sizes of the intermediate results and the execution time of every data mining operator are evaluated using OGSA-DAI-based implementations, which show the feasibility of the proposed approach and the possibility of estimating the Apriori execution process beforehand.
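To make the idea of decomposition concrete, the following is a minimal Python sketch of Apriori split into three small operators, with the size of each intermediate result recorded per pass. The operator names (`generate_candidates`, `count_support`, `filter_frequent`), the trace format, and the sample transactions are assumptions made for illustration; they are not the operators or the OGSA-DAI implementation defined in the paper, and the sketch measures intermediate sizes after the fact rather than estimating them beforehand.

```python
from itertools import combinations


def generate_candidates(frequent_prev, k):
    """Hypothetical operator: join frequent (k-1)-itemsets into k-itemsets,
    then prune candidates that have an infrequent (k-1)-subset."""
    prev = set(frequent_prev)
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    return {c for c in joined
            if all(frozenset(s) in prev for s in combinations(c, k - 1))}


def count_support(transactions, candidates):
    """Hypothetical operator: count the transactions containing each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def filter_frequent(counts, min_count):
    """Hypothetical operator: keep itemsets that reach the minimum support count."""
    return {c for c, n in counts.items() if n >= min_count}


def apriori_with_trace(transactions, min_count):
    """Run the operators level by level and record intermediate result sizes."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent_all, trace, k = {}, [], 1
    while candidates:
        counts = count_support(transactions, candidates)
        frequent = filter_frequent(counts, min_count)
        trace.append({"k": k,
                      "candidates": len(candidates),
                      "frequent": len(frequent)})
        frequent_all.update({c: counts[c] for c in frequent})
        k += 1
        candidates = generate_candidates(frequent, k)
    return frequent_all, trace


# Illustrative data only; not taken from the paper.
sample = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
itemsets, trace = apriori_with_trace(sample, min_count=3)
for row in trace:
    print(row)
```

Each trace row exposes how many candidates an operator received and how many frequent itemsets it produced at a given level, which is the kind of intermediate information a black-box implementation hides.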