NIA2: A fast indirect association mining algorithm

倪旻,徐晓飞,邓胜春,问晓先
2005-01-01
Abstract: Indirect association is a high-level relationship between items and frequent item sets in data. Potential applications of indirect associations include database marketing, intelligent data analysis, web-log analysis, and recommender systems. Most existing indirect association mining algorithms post-process the results of frequent item set discovery: all frequent item sets are generated first, and are then filtered and joined to form indirect associations. We have previously presented an indirect association mining algorithm (NIA) based on the anti-monotonicity of indirect associations, whereby k-candidate indirect associations are generated directly from (k-1)-candidate indirect associations without generating all frequent item sets. We also use the frequent itempair support matrix to reduce the time and memory the algorithm needs. In this paper, we introduce a novel algorithm (NIA2) that generates indirect association patterns between itempairs through one-item mediator sets obtained from the frequent itempair support matrix. We also introduce a support threshold for mediator sets, t_m. NIA2 mines indirect association patterns directly from the dataset, without generating all frequent item sets. Compared with existing algorithms, the frequent itempair support matrix and the use of t_m as the support threshold for mediator sets significantly reduce the cost of join operations and of the search process. Experiments on a real-world web log dataset show that NIA2 is one order of magnitude faster than existing algorithms.
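The idea of reading one-item mediators off an itempair support matrix can be illustrated with a minimal Python sketch. This is not the NIA2 algorithm itself, which the abstract does not specify in detail. The function names, the itempair support threshold `t_s`, and the toy transactions are assumptions made for illustration, and the sketch omits any dependence measure between an item and its mediator that a full definition of indirect association may require. Only `t_m` is taken from the abstract.

```python
from collections import Counter
from itertools import combinations


def pair_support_matrix(transactions):
    """Map each unordered itempair to the fraction of transactions containing both items."""
    counts = Counter()
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            counts[(a, b)] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}


def sup(matrix, a, b):
    """Look up the support of an itempair regardless of argument order."""
    return matrix.get((a, b) if a < b else (b, a), 0.0)


def one_item_mediators(transactions, t_s, t_m):
    """Find itempairs that rarely co-occur but share a one-item mediator.

    t_s (hypothetical name): an itempair is a candidate only if its support is below t_s.
    t_m: an item m mediates (a, b) only if both (a, m) and (b, m) have support >= t_m.
    No dependence measure is applied in this simplified sketch.
    """
    matrix = pair_support_matrix(transactions)
    items = sorted({i for t in transactions for i in t})
    result = {}
    for a, b in combinations(items, 2):
        if sup(matrix, a, b) >= t_s:
            continue  # the pair co-occurs too often to be an indirect association
        mediators = [
            m for m in items
            if m not in (a, b)
            and sup(matrix, a, m) >= t_m
            and sup(matrix, b, m) >= t_m
        ]
        if mediators:
            result[(a, b)] = mediators
    return result


# Toy data (invented): "a" and "b" rarely appear together, but each often appears with "m".
transactions = [["a", "m"], ["a", "m"], ["b", "m"], ["b", "m"], ["a", "b", "m"]]
print(one_item_mediators(transactions, t_s=0.3, t_m=0.5))
# prints {('a', 'b'): ['m']}
```

Because every lookup in the sketch goes through the precomputed pair supports, the dataset is scanned only once and no larger frequent item sets are enumerated, which is the kind of saving the abstract attributes to the support matrix.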