Optimizing Time Series Queries with Versions

Rui Kang, Shaoxu Song
DOI: https://doi.org/10.1145/3654962
2024-01-01
Abstract: We show that time-series databases for industrial IoT data management have an intrinsic need for an integrated, automatic version control system, which introduces advanced data semantics and query optimization. In deployed IoT database instances, IoT data managed by an LSM tree is multi-leveled and multi-versioned due to network issues and erroneous IoT readings. For data semantics, each query merges versioned data according to query expressions or data block levels. For query optimization, we find that existing time-series databases relying on write-ahead logs execute data queries suboptimally, because merging numerous versions of data becomes a performance bottleneck. In this paper, an algebra consisting of version operators addresses the semantics for time-series applications to evaluate and optimize physical query plans. We propose version reducibility as a key feature of executing consistent plans and evaluate the benefits of putting off data merges. We also show the integration of version queries into existing relational databases by translating them to standard SQL based on relational reducibility. Finally, our extended experiments show the effectiveness of optimizing execution plans over versioned data.
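To make the idea of merging versioned data and putting off merges concrete, here is a minimal Python sketch. It is not the paper's algebra or its version operators. It assumes a simplified model in which each version is a list of (timestamp, value) points, versions are ordered oldest to newest, and a newer version overwrites an older one at the same timestamp. The function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (timestamp, value); a simplified, assumed data model


def merge_versions(versions: List[List[Point]]) -> List[Point]:
    """Merge versions given oldest to newest.

    Assumption: a newer version overwrites an older one at the same timestamp.
    """
    merged: Dict[int, float] = {}
    for points in versions:
        for ts, value in points:
            merged[ts] = value  # later versions win
    return sorted(merged.items())


def merge_then_filter(versions: List[List[Point]], start: int, end: int) -> List[Point]:
    """Eager plan: merge every version in full, then apply the time filter."""
    return [(ts, v) for ts, v in merge_versions(versions) if start <= ts <= end]


def filter_then_merge(versions: List[List[Point]], start: int, end: int) -> List[Point]:
    """Deferred plan: filter each version first, then merge only what remains."""
    filtered = [[(ts, v) for ts, v in points if start <= ts <= end] for points in versions]
    return merge_versions(filtered)


# Hypothetical sample data: v2 corrects the reading at timestamp 2 and adds timestamp 4.
v1 = [(1, 10.0), (2, 20.0), (3, 30.0)]
v2 = [(2, 21.0), (4, 40.0)]

eager = merge_then_filter([v1, v2], 2, 3)
deferred = filter_then_merge([v1, v2], 2, 3)
print(eager == deferred)
```

Under this assumed overwrite rule, both plans return the same points for a time-range filter, but the deferred plan merges fewer points. This is only one simple case in which reordering a merge keeps the result consistent.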