Backward-Sort for Time Series in Apache IoTDB

Xiaojian Zhang, Hongyin Zhang, Shaoxu Song, Xiangdong Huang, Chen Wang, Jianmin Wang
DOI: https://doi.org/10.1109/icde55515.2023.00245
2023-01-01
Abstract: While time series data are naturally ordered by timestamps for efficient storage and query processing, the data points in a time series often arrive out of order. We identify two unique features of out-of-order arrivals in Apache IoTDB, i.e., delay-only and not-too-distant. It is not surprising that data points can only be delayed but should never come "earlier", before the generation of their succeeding ones. Moreover, the system employs a separation policy to handle points delayed for a very long period, and thus only sorts data points delayed to the not-too-distant future. Motivated by these features, we devise a new algorithm for sorting time series data, Backward-Sort. Intuitively, the delay-only feature leads to the strategy of moving points backward in sorting. Moreover, the not-too-distant feature results in blocks of data points, such that point moves are expected to occur locally inside the blocks. To the best of our knowledge, this is the first sorting algorithm specially designed for out-of-order arrivals in time series. The algorithm has become a fundamental component of sorting time series data in Apache IoTDB. The evaluation is conducted over real and synthetic datasets, using IoTDB-benchmark.
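The two features described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and is not taken from the IoTDB code base: the function names, the fixed `block_size` parameter, and the choice of insertion sort plus a local merge are all assumptions made for illustration. The sketch sorts each block by moving late points backward, then repairs each block boundary by merging only the overlapping region, which stays small when delays are not too distant.

```python
import heapq
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp, value); hypothetical representation


def _insertion_sort(a: List[Point], lo: int, hi: int) -> None:
    """Sort a[lo:hi] by timestamp, shifting each late point backward."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j][0] > x[0]:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def backward_sort_sketch(a: List[Point], block_size: int = 4) -> None:
    """Illustrative block-based sort (an assumption, not the paper's algorithm)."""
    if block_size < 1:
        raise ValueError("block_size must be positive")
    n = len(a)

    # Step 1: sort every block locally.
    for lo in range(0, n, block_size):
        _insertion_sort(a, lo, min(lo + block_size, n))

    # Step 2: at each block boundary b, a[0:b] is already sorted.
    # Move the delayed points at the head of the next block backward.
    for b in range(block_size, n, block_size):
        e = min(b + block_size, n)
        last_ts = a[b - 1][0]

        # k = number of points in the block that belong before a[b-1].
        k = 0
        while b + k < e and a[b + k][0] < last_ts:
            k += 1
        if k == 0:
            continue  # boundary already in order

        # p = start of the overlap in the sorted prefix; the backward scan
        # is short when delays are not too distant.
        first_ts = a[b][0]
        p = b
        while p > 0 and a[p - 1][0] > first_ts:
            p -= 1

        # Merge only the overlapping region a[p:b] with a[b:b+k].
        a[p:b + k] = list(
            heapq.merge(a[p:b], a[b:b + k], key=lambda pt: pt[0])
        )


points = [(1, 0.1), (3, 0.3), (2, 0.2), (5, 0.5),
          (4, 0.4), (7, 0.7), (6, 0.6), (8, 0.8)]
backward_sort_sketch(points, block_size=3)
print([ts for ts, _ in points])
```

In this sketch the work at each boundary is proportional to the size of the overlap rather than to the length of the sorted prefix, which is the intuition behind exploiting locality; how the actual algorithm chooses block sizes and performs the moves is described in the paper itself.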