Supporting Subseries Nearest Neighbor Search Via Approximation

Changzhou Wang, Xiaoyang Sean Wang
DOI: https://doi.org/10.1145/354756.354834
2000-01-01
Abstract: Searching for nearest neighbors in a large set of time series is an important data mining task. This paper studies the following type of time series nearest neighbor queries: Given a query series and a starting time, among all the subseries (of a collection of data series) that have the same length as the query series and start at the given time, find the K subseries that are closest to the query series. To support such queries, the paper develops a technique that uses a fixed number of values to approximate each whole data series, and obtains the approximation of any required subseries at the query time. The paper then proposes three subseries search algorithms and compares them with the naive method that sequentially scans the whole data set, as well as a method adapted from a state-of-the-art subseries search algorithm. Experiments are conducted on both a real-life data set and a synthetic one. Results show that the proposed methods access only a small portion of the precise data and outperform the others in run time.
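To make the query type concrete, the naive baseline mentioned in the abstract (a sequential scan over every data series) can be sketched as follows. This is only an illustrative sketch, not the paper's code: the function name `naive_subseries_knn`, the list-of-lists data layout, and the choice of Euclidean distance are all assumptions, since the abstract does not name a distance measure.

```python
import heapq
import math


def naive_subseries_knn(data, query, start, k):
    """Sequentially scan `data` and return the k closest subseries.

    Hypothetical helper: `data` is a list of numeric sequences, `query` is
    the query series, `start` is the shared starting time (a 0-based index),
    and `k` is the number of neighbors wanted. Euclidean distance is an
    assumption made for illustration only.
    """
    m = len(query)
    candidates = []
    for idx, series in enumerate(data):
        # A series qualifies only if it fully contains a subseries of
        # length m beginning at `start`.
        if start < 0 or start + m > len(series):
            continue
        sub = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(sub, query)))
        candidates.append((dist, idx))
    # Return (distance, series index) pairs, closest first.
    return heapq.nsmallest(k, candidates)


if __name__ == "__main__":
    data = [
        [0, 1, 2, 3, 4],
        [0, 1, 1, 1, 4],
        [5, 5, 5, 5, 5],
        [1, 2],  # too short to hold a length-3 subseries starting at 1
    ]
    print(naive_subseries_knn(data, query=[1, 2, 3], start=1, k=2))
```

Every query in this sketch reads the precise values of every qualifying series, which is the cost the paper's approximation-based methods aim to reduce.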