Abstract: Similarity search over sensor data generated by a myriad of sensing devices is a frequently encountered problem in the era of the Internet of Things (IoT). Such sensor data generally take the form of time series, i.e., temporally ordered sequences of real numbers sampled at regular intervals. Dynamic time warping (DTW) is widely accepted as the most prevalent similarity measure in the time-series mining community, mainly due to its flexibility and broad applicability. However, computing DTW between two time series has quadratic time complexity, which makes similarity search over large time-series data sets inefficient. The main contribution of this article is a method called product quantization (PQ)-based DTW (PQDTW) for fast approximate time-series similarity search under DTW. PQDTW builds on PQ, a well-known approximate nearest neighbor search approach. Conventional PQ, however, was developed for the Euclidean distance and is not designed for DTW. To solve this problem, the DTW barycenter averaging (DBA) technique is used to adapt PQ to DTW. We employ PQDTW within the filter-and-refine framework to perform time-series similarity search efficiently and accurately. Our method avoids many DTW computations in the filtering phase, which accelerates the query process. We compare PQDTW with popular related algorithms on public time-series data sets. Experimental results verify that the proposed method achieves the best tradeoff between query efficiency and retrieval accuracy among the compared methods.
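To make the quadratic cost of DTW and the filter-and-refine idea concrete, the following is a minimal Python sketch, not the PQDTW implementation. The function names (`dtw`, `downsample`, `coarse_dtw`, `filter_and_refine`) and the parameters `factor`, `k`, and `n_candidates` are hypothetical. The filter used here is a simple stand-in (DTW on block-averaged series), assumed only for illustration; the article uses PQDTW in that role.

```python
import math


def dtw(a, b):
    """Exact DTW with squared-difference local cost.

    Fills a len(a) x len(b) dynamic-programming table row by row,
    which is where the quadratic time complexity comes from.
    """
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def downsample(series, factor=2):
    """Replace each block of `factor` points by its mean (illustrative only)."""
    blocks = (series[i:i + factor] for i in range(0, len(series), factor))
    return [sum(block) / len(block) for block in blocks]


def coarse_dtw(a, b, factor=2):
    """Cheap approximate distance: DTW on downsampled series.

    Stand-in for the filtering distance; this is NOT PQDTW.
    """
    return dtw(downsample(a, factor), downsample(b, factor))


def filter_and_refine(query, dataset, approx_dist, k=1, n_candidates=10):
    """Return indices of the k best matches among the filtered candidates."""
    # Filter: rank every series with the cheap approximate distance.
    ranked = sorted(range(len(dataset)),
                    key=lambda i: approx_dist(query, dataset[i]))
    candidates = ranked[:n_candidates]
    # Refine: run exact DTW only on the surviving candidates.
    refined = sorted(candidates, key=lambda i: dtw(query, dataset[i]))
    return refined[:k]


if __name__ == "__main__":
    query = [0, 1, 2, 3]
    dataset = [[5, 5, 5, 5], [0, 1, 1, 2, 3], [9, 8, 7, 6]]
    print(filter_and_refine(query, dataset, coarse_dtw, k=1, n_candidates=2))
```

In this sketch, exact DTW runs on at most `n_candidates` series per query instead of the whole data set. Retrieval accuracy then depends on how well the approximate distance preserves the DTW ranking, which is the tradeoff the abstract refers to.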

Efficient Algorithm for a Novel Pattern of Time Series

An efficient method for time series similarity search using binary code representation and hamming distance

Dynamic Time Warping under Product Quantization, with Applications to Time-Series Data Similarity Search

A Novel Similarity Measure Approach for Time Series Based on PLA and DTW

Variable Step Algorithm for Sub-Trend Sequence Searching

Continually Evaluating Similarity-Based Pattern Queries on a Streaming Time Series

Segmental semi-markov model based online series pattern detection under arbitrary time scaling

Spatial-Temporal Congestion Identification Based on Time Series Similarity Considering Missing Data

Fast Online Similarity Search for Uncertain Time Series

Subsequence Similarity Search under Time Shifting

Interest-Based Queries For Time Series Data

Efficient Time Series Clustering And Its Application To Social Network Mining

Similarity-Based Queries for Time Series Data

Online Series Pattern Detection Based on Advanced Segmental Semi-Markov Model

A Time Series Similar Pattern Matching Algorithm Based on Singularity Event Features

FastOPM - a Practical Method for Partial Match of Time Series

A New Method for Similarity Matching of Non-Stationary Time Series Based on Fractal Time-Varying Dimension

Micro Similarity Queries in Time Series Database

Towards a faster symbolic aggregate approximation method

An Angle-Based Dissimilarity for Accelerating the Clustering of Dynamic Data in Networks

Indexable Online Time Series Segmentation with Error Bound Guarantee