Error-adaptive and time-aware maintenance of frequency counts over data streams

Hongyan Liu, Ying Lu, Jiawei Han, Jun He
DOI: https://doi.org/10.1007/11775300_41
2006-01-01
Abstract: Maintaining frequency counts for items over data streams has a wide range of applications, such as web advertisement fraud detection. This problem has attracted great attention from both researchers and practitioners, and many algorithms have been proposed. In this paper, we propose a new method, error-adaptive pruning, to maintain frequency counts more accurately. We also propose a method called fractionization to record time information together with the frequency information. Using these two methods, we design three algorithms for finding frequent items and top-k frequent items. Experimental results show that these methods are effective in improving maintenance accuracy.