PIQUE: Progressive Integrated QUery Operator with Pay-As-You-Go Enrichment

Dhrubajyoti Ghosh, Roberto Yus, Yasser Altowim, Sharad Mehrotra
DOI: https://doi.org/10.48550/arXiv.1805.12033
2019-10-18
Abstract: Big data today, in the form of text, images, video, and sensor data, needs to be enriched (i.e., annotated with tags) before it can be effectively queried or analyzed. Data enrichment (which, depending on the application, may involve compiled code, declarative queries, or expensive machine learning and/or signal processing techniques) often cannot be performed in its entirety as a pre-processing step at the time of data ingestion. Enriching data as a separate offline step after ingestion makes it unavailable for analysis during the period between ingestion and enrichment. To bridge this gap, this paper explores a novel approach that supports progressive data enrichment during query processing to enable interactive exploratory analysis. Our approach integrates an operator, called PIQUE, into query processing to prioritize the execution of enrichment functions. Query processing with the PIQUE operator significantly outperforms the baselines in terms of the rate at which answer quality improves during query processing.
Databases