On Convolutional Autoencoders to Speed Up Similarity-Based Time Series Mining

Yuri Gabriel Aragão da Silva, Diego Furtado Silva
DOI: https://doi.org/10.1109/bigdata50022.2020.9377999
2020-12-10
Abstract: Time series are a type of data increasingly present in research and industry applications. The most common approach to extracting knowledge from these data is similarity-based data mining. However, applying these algorithms to large volumes of data may be infeasible. Therefore, the literature proposes several techniques to accelerate distance calculation between time series, such as algorithm adaptations, indexing, and approximations. Recent results show that combining techniques from different categories usually improves efficiency in similarity-based time series mining tasks. Accordingly, we evaluate dimensionality reduction through convolutional autoencoders to speed up time series distance calculation. To this end, we propose an offline training phase, using data previously observed in other domains, before applying the autoencoders. Our proposal is orthogonal to state-of-the-art tools, so it can be used as a complementary technique. We show that our proposal can lead similarity-based algorithms to execute up to two orders of magnitude faster than these tools alone, without loss of quality in the results obtained through all-pairwise distance calculation, similarity search, and motif discovery.
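To illustrate the general idea of computing distances on a shorter representation, the following is a minimal sketch and not the authors' implementation. The `encode` function is a hypothetical stand-in that averages fixed-size windows; it is not a convolutional autoencoder and involves no offline training. The data, the reduction factor of 8, and all function names are assumptions made for illustration only.

```python
import numpy as np


def encode(series: np.ndarray, factor: int = 8) -> np.ndarray:
    """Hypothetical stand-in encoder (NOT a convolutional autoencoder).

    Averages non-overlapping windows of `factor` points, so each series
    of length L becomes a code of length L // factor.
    """
    n, length = series.shape
    usable = (length // factor) * factor  # drop a ragged tail, if any
    return series[:, :usable].reshape(n, usable // factor, factor).mean(axis=2)


def pairwise_euclidean(x: np.ndarray) -> np.ndarray:
    """All-pairwise Euclidean distances between the rows of x."""
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from rounding


# Synthetic random walks, used only as placeholder data (an assumption).
rng = np.random.default_rng(0)
data = rng.standard_normal((100, 512)).cumsum(axis=1)

codes = encode(data, factor=8)        # 100 series, each reduced from 512 to 64 points
full = pairwise_euclidean(data)       # distances on the raw series
reduced = pairwise_euclidean(codes)   # distances on the shorter codes
```

Each distance on the codes touches 64 values instead of 512, which is where the saving comes from in this sketch. How well the reduced distances preserve the results of the original ones depends entirely on the encoder, which is the question the paper evaluates for trained convolutional autoencoders.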