Search on Secondary Attributes in Geo-Distributed Systems

Dimitrios Vasilas
DOI: https://doi.org/10.48550/arXiv.1801.02974
2018-01-09
Abstract: In the age of big data, more and more applications need to query and analyse large volumes of continuously updated data in real time. In response, cloud-scale storage systems can extend their interface, which allows fast lookups on the primary key, with the ability to retrieve data based on non-primary attributes. However, the need to ingest content rapidly and make it searchable immediately while supporting low-latency, high-throughput query evaluation, combined with the geo-distributed nature and weak consistency guarantees of modern storage systems, poses several challenges to the implementation of indexing and search systems. We present our early-stage work on the design and implementation of an indexing and query processing system that enables real-time queries on secondary attributes of data stored in geo-distributed, weakly consistent storage systems.
Distributed, Parallel, and Cluster Computing