DTW-C++: Fast dynamic time warping and clustering of time series data

David A. Howey,Volkan Kumtepeli,Rebecca Perriment
DOI: https://doi.org/10.21105/joss.06881
2024-09-06
Abstract:Time-series data analysis is of interest in a huge number of different applications, from finding patterns of energy consumption to detecting brain activity or discovering stock price trends. Unsupervised learning methods can help analysts unlock patterns in data, and a key example of this is clustering. However, clustering of time series data can be computationally expensive for large datasets. We present an approach for computationally efficient dynamic time warping (DTW) and clustering of time-series data. The method frames the dynamic warping of time series datasets as an optimisation problem solved using dynamic programming, and then clusters time series data by solving a second optimisation problem using integer programming. There is also an option to use k-medoids clustering when a certificate for global optimality is not essential. The increased speed of our approach is due to task-level parallelisation and memory efficiency improvements. The method was tested using the UCR Time Series Archive, and was found to be on average 33% faster than the next fastest option when using the same clustering approach. This increases to 64% faster when considering only larger datasets (with more than 1000 time series). The integer programming clustering is most effective on small numbers of longer time series, because the DTW computation is faster than other approaches, but the clustering problem becomes increasingly computationally expensive as the number of time series increases.
Computer Science
What problem does this paper attempt to address?