Abstract: Similarity search over sensor data generated by myriad sensing devices is a frequently encountered problem in the era of the Internet of Things (IoT). Such sensor data generally appear in the form of time series, i.e., temporally ordered sequences of real numbers sampled at regular intervals. Dynamic time warping (DTW) is widely accepted as the most prevalent similarity measure in the time-series mining community, mainly because of its flexibility and broad applicability. However, computing DTW between two time series has quadratic time complexity, which makes similarity search over large time-series data sets inefficient. The main contribution of this article is a method called product quantization (PQ)-based DTW (PQDTW) for fast approximate time-series similarity search under DTW. PQDTW builds on PQ, a well-known approximate nearest neighbor search approach. Conventional PQ, however, was developed for the Euclidean distance and is not designed for DTW. To solve this problem, the DTW barycenter averaging (DBA) technique is used to adapt PQ to DTW. We employ PQDTW within the filter-and-refine framework to perform time-series similarity search efficiently and accurately. Our method avoids many DTW computations in the filtering phase, which accelerates the query process. We compare PQDTW with related popular algorithms on public time-series data sets. Experimental results verify that the proposed method achieves the best tradeoff between query efficiency and retrieval accuracy among the compared methods.
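To make the quadratic cost of DTW and the filter-and-refine idea concrete, the following is a minimal Python sketch, not the PQDTW method itself. The function names (`dtw`, `coarse`, `filter_and_refine`) and the parameters `keep` and `segments` are hypothetical, and the filtering step uses a simple stand-in approximation (DTW on segment means) in place of the PQ-based approximate distance described in the abstract.

```python
import numpy as np

def dtw(a, b):
    """Exact DTW with squared point cost; O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    # cost[i, j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j],      # repeat b[j-1]
                                 cost[i, j - 1],      # repeat a[i-1]
                                 cost[i - 1, j - 1])  # advance both
    return float(np.sqrt(cost[n, m]))

def coarse(x, segments=8):
    """Stand-in approximation (assumption): mean of each of `segments` chunks."""
    chunks = np.array_split(np.asarray(x, dtype=float), segments)
    return np.array([c.mean() for c in chunks])

def filter_and_refine(query, database, keep=3, segments=8):
    """Filter with a cheap approximate distance, refine with exact DTW."""
    q_coarse = coarse(query, segments)
    # Filtering phase: cheap DTW on the shortened series.
    approx = [dtw(q_coarse, coarse(s, segments)) for s in database]
    candidates = np.argsort(approx)[:keep]
    # Refinement phase: exact DTW only on the surviving candidates.
    exact = {int(i): dtw(query, database[i]) for i in candidates}
    best = min(exact, key=exact.get)
    return best, exact[best]
```

In this sketch only `keep` exact DTW computations are run per query instead of one per database series, which is where the speedup of a filter-and-refine scheme comes from. The result is approximate, because the true nearest neighbor can be discarded during filtering.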

Fast Similarity Matching on Data Stream with Noise

Dynamic Time Warping under Product Quantization, with Applications to Time-Series Data Similarity Search

Similarity Match over High Speed Time-Series Streams

Estimating Similarity over Data Streams Based on Dynamic Time Warping

A Novel Similarity Measure Approach for Time Series Based on PLA and DTW

Research on the Fast Top-K Subsequence Matching Algorithm over Massive Data Streams

A Novel Similarity Search Approach for Streaming Time Series

Algorithm Based on Sliding Window for Similarity Queries over Data Stream

Efficient Similarity Searching Approach For Streaming Time Series

Speed Up Similarity Search of Time Series under Dynamic Time Warping

Similarity Search in Data Stream with Adaptive Segmental Approximations

Continually Evaluating Similarity-Based Pattern Queries on a Streaming Time Series

Segment-based similar time series search

Approximate Similarity Search Over Multiple Stream Time Series

Feature-Based Online Representation Algorithm for Streaming Time Series Similarity Search

An Acceleration Method For Similar Time-Series Finding

Continuous similarity join on data streams

Fine-grained Pattern Matching Over Streaming Time Series

Similarity Query Processing Algorithm over Data Stream Based on LCSS

Accelerating Time Series Similarity Search under Move-Split-Merge Distance Via Dissimilarity Space Embedding

Efficient Principal Subspace Projection of Streaming Data Through Fast Similarity Matching