Abstract: Obtaining a robust time-series clustering with the best choice of distance measure and an appropriate representation is always a challenge. We propose a novel mechanism to identify clusters by combining a learned compact representation of time-series, the Auto Encoded Compact Sequence (AECS), with a hierarchical clustering approach. The proposed algorithm aims to address the long computing time of hierarchical clustering, since the learned latent representation AECS is much shorter than the original time-series, while also enhancing its performance. Our algorithm exploits a Recurrent Neural Network (RNN) based undercomplete Sequence to Sequence (seq2seq) autoencoder and agglomerative hierarchical clustering with a choice of best distance measure to recommend the best clustering. Our scheme selects the best distance measure and the corresponding clustering for both univariate and multivariate time-series. We have experimented with real-world time-series from the UCR and UCI archives, taken from diverse application domains such as health, smart city, and manufacturing. Experimental results show that the proposed method not only produces results close to the benchmark but in some cases also outperforms it.

What problem does this paper attempt to address?

The problem this paper attempts to solve is how to achieve robust time-series clustering while selecting the best distance metric and an appropriate time-series representation. Specifically, the authors propose a novel mechanism that combines a learned compact time-series representation (Auto Encoded Compact Sequence, AECS) with hierarchical clustering, in order to reduce the long computation time of traditional hierarchical clustering and to improve clustering performance.

### Specific description of the problem

1. **Robust time-series clustering**:
   - Time-series data usually lacks labels, and obtaining knowledge from domain experts is costly. Effective unsupervised learning methods are therefore required to discover patterns, groups, and subgroups.
   - In fields such as medical care, manufacturing, and smart cities, the complexity and diversity of time-series data increase the difficulty of clustering.
2. **Selecting the best distance metric**:
   - Different distance metrics (such as Chebyshev, Manhattan, and Mahalanobis distance) affect the clustering results differently, so selecting an appropriate one is crucial.
   - An internal clustering validation measure is required to evaluate and select the best distance metric.
3. **Efficient time-series representation**:
   - Traditional hierarchical clustering has high computational overhead on long sequences, which makes it inefficient.
   - A compact time-series representation is required that reduces computation time while retaining the important features of the time-series.

### Solution

The solution proposed by the authors includes the following aspects:

1. **Auto Encoded Compact Sequence (AECS)**:
   - Use a Seq2Seq LSTM autoencoder to learn a compact representation of the time-series. The resulting latent representation is much shorter than the original time-series.
   - This compact representation reduces computation time while still capturing the key features of the time-series.
2. **Hierarchical clustering**:
   - Apply agglomerative hierarchical clustering to the AECS.
   - Use average linkage to compute the similarity between clusters, which can form non-convex clusters as needed in various practical applications.
3. **Distance metric selection**:
   - Compare three distance metrics: Chebyshev distance, Manhattan distance, and Mahalanobis distance.
   - Use the Modified Hubert Statistic (T) as an internal clustering validation index and select the clustering result with the highest T value.
4. **Extensive experimental verification**:
   - Conduct experiments on multiple univariate and multivariate time-series datasets from the UCR Time Series Classification Archive and the UCI Machine Learning Repository.
   - The experimental results show that the method reaches the level of the benchmark algorithms and exceeds them in some cases.

Together, these steps target both the computational cost and the clustering quality of time-series clustering.
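
The clustering and metric-selection steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the latent AECS vectors are already available as rows of a NumPy array (a synthetic two-blob array stands in for them here). The function names `modified_hubert` and `best_clustering` are hypothetical, and the number of clusters is assumed to be known. The Modified Hubert statistic is taken in its common form, the mean over all point pairs of d(x_i, x_j) · d(c_i, c_j), where c_i is the centroid of the cluster containing x_i. Scoring every candidate with Euclidean distance is an assumption made here so that the scores are comparable; the paper may score differently. SciPy's `cityblock` metric is the Manhattan distance.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def modified_hubert(points, labels):
    """Mean over all point pairs of d(x_i, x_j) * d(c_i, c_j)."""
    # Centroid of each cluster, then one centroid row per point.
    centroids = {k: points[labels == k].mean(axis=0) for k in np.unique(labels)}
    point_centroids = np.array([centroids[k] for k in labels])
    # pdist lists the n(n-1)/2 pairs in the same order for both calls.
    # Assumption: Euclidean distance is used for scoring all candidates.
    d_points = pdist(points, metric="euclidean")
    d_centroids = pdist(point_centroids, metric="euclidean")
    return float(np.mean(d_points * d_centroids))


def best_clustering(latent, n_clusters,
                    metrics=("chebyshev", "cityblock", "mahalanobis")):
    """Cluster with average linkage under each metric; keep the best score."""
    results = {}
    for metric in metrics:
        # Agglomerative clustering with average linkage on the latent vectors.
        tree = linkage(latent, method="average", metric=metric)
        # Cut the dendrogram into at most n_clusters flat clusters.
        labels = fcluster(tree, t=n_clusters, criterion="maxclust")
        results[metric] = (modified_hubert(latent, labels), labels)
    best = max(results, key=lambda m: results[m][0])
    scores = {m: score for m, (score, _) in results.items()}
    return best, results[best][1], scores


# Synthetic stand-in for AECS vectors: two well-separated blobs in 4 dimensions.
rng = np.random.default_rng(0)
latent = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 4)),
    rng.normal(5.0, 0.3, size=(20, 4)),
])
best_metric, labels, scores = best_clustering(latent, n_clusters=2)
print(best_metric, scores)
```

Because the pairwise distances are computed on short latent vectors rather than on the full-length series, each `linkage` call works with much smaller inputs, which is the source of the computational saving the summary describes. Note that the Mahalanobis metric needs more samples than latent dimensions so that the covariance matrix is invertible.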

Hierarchical Clustering using Auto-encoded Compact Representation for Time-series Analysis

End-to-end deep representation learning for time series clustering: a comparative study

A clustering approach to time series forecasting using neural networks: A comparative study on distance-based vs. feature-based clustering methods

Clustering Time Series Utilizing A Dimension Hierarchical Decomposition Approach

Efficient Forecasting of Large Scale Hierarchical Time Series via Multilevel Clustering

Deep Temporal Clustering : Fully Unsupervised Learning of Time-Domain Features

Tree-based Methods for Clustering Time Series Using Domain-Relevant Attributes

Research on load clustering algorithm based on variational autoencoder and hierarchical clustering

Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting

A benchmark study on time series clustering

Deep Spatiotemporal Clustering: A Temporal Clustering Approach for Multi-dimensional Climate Data

Scalable Hierarchical Agglomerative Clustering

Divide-and-Conquer: A Distributed Hierarchical Factor Approach to Modeling Large-Scale Time Series Data

Hierarchical Clustering using Reversible Binary Cellular Automata for High-Dimensional Data

Recurrent Deep Divergence-based Clustering for simultaneous feature learning and clustering of variable length time series

Cluster-and-Conquer: A Framework For Time-Series Forecasting

Exploring structural components in autoencoder-based data clustering

FeatTS: Feature-based Time Series Clustering

Autoencoder-Enhanced Clustering: A Dimensionality Reduction Approach to Financial Time Series

Learning Representations for Incomplete Time Series Clustering

Intelligent Trading System: Multidimensional financial time series clustering