Partitioning, Indexing and Querying Spatial Data on Cloud

Afsin Akdogan
DOI: https://doi.org/10.48550/arXiv.1612.05858
2016-12-18
Abstract: The number of mobile devices (e.g., smartphones, wearable technologies) is rapidly growing. In line with this trend, a massive amount of spatial data is being collected, since these devices allow users to geo-tag user-generated content. Clearly, a scalable computing infrastructure is needed to manage such large datasets. Meanwhile, Cloud Computing service providers (e.g., Amazon, Google, and Microsoft) allow users to lease computing resources. However, most existing spatial indexing techniques are designed for the centralized paradigm, which is limited to the capabilities of a single server. To address the scalability shortcomings of existing approaches, we provide a study that focuses on generating a distributed spatial index structure that not only scales out to multiple servers but also scales up, since it fully exploits the multi-core CPUs available on each server. We use the Voronoi diagram as the partitioning and indexing technique, and we also use it to process spatial queries effectively. More specifically, since the data objects continuously move and issue position updates to the index structure, we collect the latest positions of objects and periodically generate a read-only index to eliminate costly distributed updates. Our approach scales near-linearly in index construction and query processing, and can efficiently construct an index for millions of objects within a few seconds. In addition to scalability and efficiency, we also aim to maximize server utilization so that the same workload can be supported with fewer servers. Server utilization is a crucial point when using Cloud Computing because users are charged based on the total amount of time they reserve each server, with no consideration of utilization.
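The two ideas named in the abstract, Voronoi-based partitioning and a periodically rebuilt read-only index, can be illustrated with a minimal single-process sketch. This is not the paper's implementation: the function names, the pivot points, and the sample updates are all hypothetical, and a distributed version would presumably place each cell on its own server. Assigning an object to its nearest pivot is, by definition, placing it in that pivot's Voronoi cell.

```python
import math
from collections import defaultdict

def nearest_pivot(point, pivots):
    """Return the index of the pivot closest to `point`.

    The set of points closest to pivot i is exactly the Voronoi cell of i.
    """
    return min(range(len(pivots)), key=lambda i: math.dist(point, pivots[i]))

def build_snapshot_index(latest_positions, pivots):
    """Build a read-only index: cell id -> tuple of (object id, position).

    Hypothetical helper: it is rebuilt from scratch on each period instead of
    being updated in place, so no per-update index maintenance is needed.
    """
    cells = defaultdict(list)
    for obj_id, pos in latest_positions.items():
        cells[nearest_pivot(pos, pivots)].append((obj_id, pos))
    # Freeze each cell so the snapshot cannot be modified after construction.
    return {cell: tuple(members) for cell, members in cells.items()}

# Buffered position updates (made-up data); a later update overwrites an earlier one.
updates = [("a", (1.0, 1.0)), ("b", (9.0, 9.0)), ("a", (8.0, 8.5)), ("c", (0.5, 2.0))]
latest = {}
for obj_id, pos in updates:
    latest[obj_id] = pos

# Assumed pivots; in a distributed setting each pivot's cell would map to a server.
pivots = [(0.0, 0.0), (10.0, 10.0)]
index = build_snapshot_index(latest, pivots)
```

In this sketch, object `a` ends up in the cell of the second pivot because only its most recent position is indexed; the earlier update is discarded before the snapshot is built.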
Databases; Distributed, Parallel, and Cluster Computing