Density Estimation Over Data Stream

Aoying Zhou, Zhiyuan Cai, Li Wei
2002-01-01
Abstract: Density estimation is an important but costly operation for applications that need to know the distribution of a data set. Moreover, when the data arrives as a stream, traditional density estimation methods cannot cope with it efficiently. In this paper, we examine the problem of computing a density function over data streams and develop a novel method to solve it. Our algorithm is built on a new concept, the M-Kernel, and has the following characteristics: (1) its running time is linear in the data size, (2) it keeps the whole computation within a limited amount of memory, (3) its accuracy is comparable to that of traditional methods, (4) a usable density model is available at any time during processing, and (5) it is flexible and can suit different stream models. Analytical and experimental results show the efficiency of the proposed algorithm.
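To make the listed characteristics concrete, the following is a minimal sketch of a memory-bounded streaming kernel density estimator in one dimension. It is not the paper's M-Kernel algorithm, whose details the abstract does not give. The class name `MergingKDE`, the parameters `budget` and `bandwidth`, the Gaussian kernel, and the rule of merging the two kernels with the closest means by moment matching are all assumptions chosen for illustration.

```python
import math
from bisect import insort


class MergingKDE:
    """Hypothetical streaming 1-D density estimate using at most `budget` kernels."""

    def __init__(self, budget=100, bandwidth=0.5):
        self.budget = budget      # maximum number of kernels kept in memory
        self.h = bandwidth        # initial standard deviation of each new kernel
        self.kernels = []         # sorted list of (mean, weight, std)
        self.n = 0                # number of points seen so far

    def update(self, x):
        """Absorb one stream element; cost depends on `budget`, not on `n`."""
        insort(self.kernels, (x, 1.0, self.h))
        self.n += 1
        if len(self.kernels) > self.budget:
            self._merge_closest()

    def _merge_closest(self):
        # Assumed merge rule: pick the adjacent pair with the smallest gap in means.
        i = min(range(len(self.kernels) - 1),
                key=lambda k: self.kernels[k + 1][0] - self.kernels[k][0])
        (m1, w1, s1), (m2, w2, s2) = self.kernels[i], self.kernels[i + 1]
        w = w1 + w2
        m = (w1 * m1 + w2 * m2) / w
        # Moment matching: the merged kernel keeps the pair's total weight,
        # mean, and variance.
        var = (w1 * (s1 ** 2 + (m1 - m) ** 2) + w2 * (s2 ** 2 + (m2 - m) ** 2)) / w
        # The merged mean lies between m1 and m2, so the list stays sorted.
        self.kernels[i:i + 2] = [(m, w, math.sqrt(var))]

    def density(self, x):
        """Evaluate the current estimate at x; usable at any point in the stream."""
        if self.n == 0:
            return 0.0
        total = 0.0
        for m, w, s in self.kernels:
            z = (x - m) / s
            total += w * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
        return total / self.n
```

Under these assumptions, each `update` costs time proportional to `budget`, so the total running time grows linearly with the stream length, the memory footprint never exceeds `budget` kernels, and `density` can be called between any two updates. How closely such a scheme tracks a conventional kernel estimate depends on the merge rule and the budget, which is the trade-off the paper's analysis addresses.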