Exploring variable-length time series motifs in one hundred million length scale

Yifeng Gao, Jessica Lin
DOI: https://doi.org/10.1007/s10618-018-0570-1
IF: 5.406
2018-05-10
Data Mining and Knowledge Discovery
Abstract: The exploration of repeated patterns with different lengths, also called variable-length motifs, has received a great amount of attention in recent years. However, existing algorithms for detecting variable-length motifs in large-scale time series are very time-consuming. In this paper, we introduce a time- and space-efficient approximate variable-length motif discovery algorithm, Distance-Propagation Sequitur (DP-Sequitur), for detecting variable-length motifs in large-scale time series data (e.g., over one hundred million points in length). The discovered motifs can be ranked by different metrics such as frequency or similarity, and can benefit a wide variety of real-world applications. We demonstrate that our approach can discover motifs in time series with over one hundred million points in just minutes, which is significantly faster than the fastest existing algorithm to date. We demonstrate the superiority of our algorithm over the state of the art using several real-world time series datasets.
computer science, information systems, artificial intelligence