Finding Frequent Items in Time Decayed Data Streams.

Shanshan Wu, Huaizhong Lin, Leong Hou U, Yunjun Gao, Dongming Lu
DOI: https://doi.org/10.1007/978-3-319-45817-5_2
2016-01-01
Abstract: Identifying frequently occurring items is a basic building block in many data stream applications. Much prior work has studied how to identify frequent items efficiently under the landmark and sliding window models. In this work, we revisit the problem under a new streaming model based on time decay, in which the importance of every arriving item decreases over time. To handle these changes in importance, we propose a new heap structure, named Quasi-heap, which maintains the item order using a lazy update mechanism. Based on the Quasi-heap, we propose two approximation algorithms, Space Saving with Quasi-heap (SSQ) and Filtered Space Saving with Quasi-heap (FSSQ), to find the frequently occurring items. Extensive experiments demonstrate the superiority of the proposed algorithms in terms of both efficiency (i.e., response time) and effectiveness (i.e., accuracy).
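To make the time-decay setting concrete, here is a minimal sketch of a Space Saving style counter whose counts decay over time. It is not the paper's SSQ or FSSQ algorithm and does not implement the Quasi-heap. The exponential decay function, the class name `DecayedSpaceSaving`, and the parameters `capacity` and `decay_rate` are assumptions made for illustration, since the abstract does not specify them. The only lazy aspect is that each counter stores its last update time and is decayed when it is read, and the minimum is found by a linear scan rather than a heap.

```python
import math


class DecayedSpaceSaving:
    """Illustrative Space Saving variant with exponentially decayed counts.

    Hypothetical sketch: exponential decay and all names here are
    assumptions, not taken from the paper.
    """

    def __init__(self, capacity, decay_rate):
        self.capacity = capacity      # maximum number of monitored items
        self.decay_rate = decay_rate  # assumed exponential decay rate
        self.counters = {}            # item -> (count, last_update_time)

    def value_at(self, item, now):
        # Decay lazily: apply the decay only when the counter is read.
        count, last = self.counters[item]
        return count * math.exp(-self.decay_rate * (now - last))

    def add(self, item, now):
        if item in self.counters:
            self.counters[item] = (self.value_at(item, now) + 1.0, now)
        elif len(self.counters) < self.capacity:
            self.counters[item] = (1.0, now)
        else:
            # Space Saving replacement: evict the item with the smallest
            # decayed count and let the newcomer inherit that count.
            victim = min(self.counters, key=lambda k: self.value_at(k, now))
            inherited = self.value_at(victim, now)
            del self.counters[victim]
            self.counters[item] = (inherited + 1.0, now)

    def frequent(self, now, threshold):
        # Report monitored items whose decayed count reaches the threshold.
        result = {}
        for item in self.counters:
            value = self.value_at(item, now)
            if value >= threshold:
                result[item] = value
        return result


# Usage: a decay rate of ln(2) halves every count per time unit.
ss = DecayedSpaceSaving(capacity=2, decay_rate=math.log(2))
for _ in range(3):
    ss.add("a", now=0)
ss.add("b", now=1)
ss.add("c", now=2)  # "b" has the smallest decayed count and is evicted
print(ss.frequent(now=2, threshold=1.0))
```

The linear scan in `add` costs time proportional to the number of monitored items on every eviction. Avoiding that cost while counts keep changing with time is the kind of problem an ordered structure with lazy updates, such as the Quasi-heap described in the abstract, is meant to address.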