Time series clustering in linear time complexity

Xiaosheng Li, Jessica Lin, Liang Zhao
DOI: https://doi.org/10.1007/s10618-021-00798-w
IF: 5.406
2021-09-18
Data Mining and Knowledge Discovery
Abstract: With the increasing power of data storage and advances in data generation and collection technologies, large volumes of time series data become available and the content is changing rapidly. This requires data mining methods to have low time complexity to handle the huge and fast-changing data. This article presents a novel time series clustering algorithm that has linear time complexity. The proposed algorithm partitions the data by checking some randomly selected symbolic patterns in the time series. We provide theoretical analysis to show that group structures in the data can be revealed from this process. We evaluate the proposed algorithm extensively on all 128 datasets from the well-known UCR time series archive, and compare with the state-of-the-art approaches with statistical analysis. The results show that the proposed method achieves better accuracy compared with other rival methods. We also conduct experiments to explore how the parameters and configuration of the algorithm can affect the final clustering results.
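Illustrative note: the abstract describes partitioning the data by checking randomly selected symbolic patterns. The toy sketch below shows one way such a check could look; it is not the authors' implementation. Everything in it is an assumption made for illustration: the function names `symbolize`, `words`, and `partition_by_random_pattern`, the three-letter alphabet with breakpoints at ±0.43, the sliding-window word length, and the single two-way split.

```python
import random
from statistics import mean, pstdev


def symbolize(series, breakpoints=(-0.43, 0.43), alphabet="abc"):
    """Z-normalize a series and map each value to a letter.

    Assumption: a simplified SAX-like discretization without
    piecewise aggregation; not taken from the paper.
    """
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # guard against constant series
    letters = []
    for x in series:
        z = (x - mu) / sigma
        # The letter index is the number of breakpoints below z.
        letters.append(alphabet[sum(z > b for b in breakpoints)])
    return "".join(letters)


def words(symbols, length=3):
    """Return the set of sliding-window words found in a symbol string."""
    return {symbols[i:i + length] for i in range(len(symbols) - length + 1)}


def partition_by_random_pattern(dataset, word_length=3, seed=0):
    """Split the dataset by presence of one randomly chosen word.

    Hypothetical helper: each series is visited once, so for a fixed
    series length the work grows linearly with the number of series.
    Assumes a non-empty dataset with series at least word_length long.
    """
    rng = random.Random(seed)
    bags = [words(symbolize(s), word_length) for s in dataset]
    vocabulary = sorted(set().union(*bags))
    pattern = rng.choice(vocabulary)
    labels = [int(pattern in bag) for bag in bags]
    return pattern, labels


data = [[1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1], [1, 2, 3, 4, 5, 7]]
pattern, labels = partition_by_random_pattern(data, seed=1)
print(pattern, labels)
```

In this toy input the two rising series map to the same symbol string and the falling series shares no three-letter word with them, so any chosen pattern places the rising series together and the falling one apart. A single random split like this is only a building block; how such checks are combined into a final clustering is described in the article itself.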
computer science, information systems, artificial intelligence