OLAPing Stream Data: Is It Feasible?

Yixin Chen, Guozhu Dong, Jiawei Han, Jian Pei, Benjamin W. Wah, Jianyong Wang
2002-01-01
Abstract: Real-time surveillance systems and other dynamic environments often generate a tremendous (potentially infinite) volume of stream data: the volume is too large to be scanned multiple times. However, much of this data resides at a rather low level of abstraction, whereas most analysts are interested in dynamic changes (such as trends and outliers) at relatively high levels of abstraction. To discover such high-level characteristics, one may need to perform on-line multi-level analysis of stream data, similar to OLAP (on-line analytical processing) of relational or data warehouse data. With limited storage space and the demand for fast response, is it realistic to promote on-line, multi-dimensional analysis and mining of stream data to alert people to dramatic changes of situation at multiple levels of abstraction? In this paper, we present an architecture, called stream cube, which, based on our analysis, is feasible for successful on-line, multi-dimensional, multi-level analysis of stream data. By successful, we mean that the system will provide analytical power and flexibility, derive timely and quality responses, and consume limited memory space and other resources. The general design of the stream cube architecture is as follows. First, a tilt time frame model is taken as the default model for the time dimension. Such a model reduces the amount of data to be retained in memory or stored on disk while still providing flexibility and analytical power. Second, a small number of critical layers are maintained for flexible analysis. Consider that the stream data resides at the primitive layer. It is desirable to identify two critical higher layers in applications: the …
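As an illustration of why a tilt time frame reduces retained data, the following minimal Python sketch keeps recent aggregates at fine granularity and rolls older ones up into coarser units. It is not taken from the paper: the class name `TiltedTimeFrame`, the quarter/hour/day levels, their capacities, and the use of a sum as the aggregate are all assumptions chosen for illustration.

```python
from collections import deque


class TiltedTimeFrame:
    """Illustrative sketch: fine granularity for recent data, coarse for old."""

    # (level name, slots per unit of the next level); hypothetical choice:
    # 4 quarter-hours per hour, 24 hours per day, at most 31 days retained.
    LEVELS = [("quarter", 4), ("hour", 24), ("day", 31)]

    def __init__(self):
        self.slots = [deque() for _ in self.LEVELS]

    def add(self, value, level=0):
        """Insert one aggregate (e.g. a 15-minute sum) at the given level."""
        slots = self.slots[level]
        slots.append(value)
        capacity = self.LEVELS[level][1]
        if len(slots) == capacity and level + 1 < len(self.LEVELS):
            # A full unit of the next level has accumulated:
            # merge it into one value and promote it to the coarser level.
            merged = sum(slots)
            slots.clear()
            self.add(merged, level + 1)
        elif len(slots) > capacity:
            # Coarsest level is full: discard the oldest slot.
            slots.popleft()


ttf = TiltedTimeFrame()
for v in range(1, 9):      # eight quarter-hour measurements
    ttf.add(v)
print(list(ttf.slots[1]))  # two hourly sums
```

Under these assumed levels the structure never holds more than 4 + 24 + 31 = 59 values, whereas storing every quarter-hour value for 31 days would take 31 × 24 × 4 = 2,976 slots.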