Parallelizing Probabilistic Streaming Skyline Operator in Cloud Computing Environments

Xiaoyong Li, Yijie Wang, Xiaoling Li, Yuan Wang, Rubing Huang
DOI: https://doi.org/10.1109/COMPSAC.2013.15
2013-01-01
Abstract: Skyline query processing over uncertain data streams has received considerable attention because of its importance in helping users make intelligent decisions over complex data. Nevertheless, existing studies focus only on retrieving skylines over data streams in a centralized environment, typically with one processor, which limits the scalability of the algorithms and cannot meet the requirements of massive data analysis. The emerging cloud computing environment is more reliable and stable than traditional distributed environments and is well suited to massive data management and complex queries. Unfortunately, existing parallel frameworks in clouds, such as MapReduce and its variants, are not suitable for skyline queries over uncertain data streams. In this paper, we propose a general framework for parallelizing the probabilistic streaming skyline operator with sliding window partitioning. In particular, we propose four item-mapping strategies, CMS, AMS, DMS and APS, to optimize queries based on the proposed parallel framework. Extensive experiments with real deployment are conducted to demonstrate the effectiveness and efficiency of the proposals.