A novel Move-Split-Merge based Fuzzy C-Means algorithm for clustering time series

Wei Ba, Zongquan Gu
DOI: https://doi.org/10.1007/s12530-024-09610-8
IF: 2.347
2024-10-22
Evolving Systems
Abstract: Noisy time series data pose significant challenges for clustering algorithms, including noise interference, temporal distortions, and irregular data patterns. To cope with these challenges and improve clustering performance, a Move-Split-Merge based Fuzzy C-Means algorithm (MSMFCM) is proposed. First, dynamic wavelet basis functions and the Median Absolute Deviation (MAD) are used to optimize the wavelets, reducing noise and highlighting the actual patterns in the original data. Second, a similarity matrix, constructed using the Move-Split-Merge (MSM) edit distance metric, quantitatively assesses the similarity between each pair of time series. Third, to improve clustering efficiency, K-means++ is used to optimize the initial centers of the Fuzzy C-Means algorithm. On twenty datasets, the performance of MSMFCM is compared with that of K-means, K-medoids, Fuzzy C-Means, K-shape, and algorithms incorporating Dynamic Time Warping and the Longest Common Subsequence. Simulation results show that MSMFCM significantly outperforms its closest competitors on the Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI), with an average improvement of 26.09% for ARI and 18.86% for NMI. This indicates that MSMFCM clusters noisy time series data more effectively, which extends the applicability of clustering to a wider range of data.
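The MSM step named in the abstract can be illustrated with a minimal sketch of the standard Move-Split-Merge dynamic program for univariate series. This is not the authors' implementation: the function names `msm_distance` and `pairwise_msm`, the split/merge cost `c=1.0`, and the choice to return a distance matrix (rather than the similarity matrix the abstract mentions, whose exact construction is not given here) are all assumptions made for illustration.

```python
def _msm_cost(new, prev, other, c):
    """Cost of one split/merge step involving the value `new`.

    If `new` lies between its neighbour `prev` and the value `other`
    from the opposite series, only the fixed cost c is paid; otherwise
    a move to the nearer of the two is added.
    """
    if prev <= new <= other or prev >= new >= other:
        return c
    return c + min(abs(new - prev), abs(new - other))


def msm_distance(x, y, c=1.0):
    """Move-Split-Merge distance between two non-empty numeric sequences.

    `c` is the cost of a split or merge operation (assumed value, not
    taken from the paper).
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")

    cost = [[0.0] * m for _ in range(n)]
    cost[0][0] = abs(x[0] - y[0])

    # First column: only x advances.
    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + _msm_cost(x[i], x[i - 1], y[0], c)
    # First row: only y advances.
    for j in range(1, m):
        cost[0][j] = cost[0][j - 1] + _msm_cost(y[j], y[j - 1], x[0], c)

    for i in range(1, n):
        for j in range(1, m):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(x[i] - y[j]),                    # move
                cost[i - 1][j] + _msm_cost(x[i], x[i - 1], y[j], c),      # x advances
                cost[i][j - 1] + _msm_cost(y[j], y[j - 1], x[i], c),      # y advances
            )
    return cost[n - 1][m - 1]


def pairwise_msm(series, c=1.0):
    """Symmetric matrix of MSM distances between all pairs of series."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(a + 1, k):
            dist[a][b] = dist[b][a] = msm_distance(series[a], series[b], c)
    return dist
```

Calling `pairwise_msm([[1, 2, 3], [1, 2, 3], [3, 2, 1]])` gives a 3 x 3 matrix with zeros on the diagonal and a zero entry for the two identical series. A matrix like this could then feed a distance-based clustering step; how MSMFCM converts it into similarities and combines it with Fuzzy C-Means is not specified in the abstract.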
computer science, artificial intelligence