An Effective and Efficient Algorithm for K-means Clustering with New Formulation

Feiping Nie, Ziheng Li, Rong Wang, Xuelong Li
DOI: https://doi.org/10.1109/tkde.2022.3155450
IF: 9.235
2023-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: K-means is one of the simplest and most popular clustering algorithms, and it is implemented as a standard clustering method in most machine learning research. The goal of K-means clustering is to find a set of cluster centers that minimizes the sum of squared distances between each sample and its nearest cluster center. In this paper, we propose a novel K-means clustering algorithm, which reformulates the classical K-means objective function as a trace maximization problem and then replaces it with a new formulation. The proposed algorithm does not need to calculate the cluster centers in each iteration and requires fewer additional intermediate variables during the optimization process. In addition, we propose an efficient iterative re-weighted algorithm to solve the resulting optimization problem and provide the corresponding convergence analysis. The proposed algorithm has the same computational complexity as Lloyd's algorithm, $\mathcal{O}(ndk)$, but shows a faster convergence rate in experiments. Extensive experimental results on real-world benchmark datasets show the effectiveness and efficiency of the proposed algorithm.
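The link between the classical objective and a trace maximization problem can be checked numerically. The sketch below is not the paper's new formulation or its re-weighted solver; it only illustrates the standard identity that, for a one-hot cluster indicator matrix Y, the sum of squared distances to the cluster means equals tr(X X^T) - tr((Y^T Y)^{-1} Y^T X X^T Y), so minimizing the former is equivalent to maximizing the second trace. The function names, the random data, and the random labels are assumptions made for illustration.

```python
import numpy as np

def kmeans_objective(X, labels, k):
    """Sum of squared distances from each sample to its cluster mean."""
    total = 0.0
    for j in range(k):
        members = X[labels == j]
        if len(members) == 0:
            continue  # an empty cluster contributes nothing
        center = members.mean(axis=0)
        total += ((members - center) ** 2).sum()
    return total

def trace_form(X, labels, k):
    """tr(X X^T) - tr((Y^T Y)^{-1} Y^T X X^T Y) for a one-hot indicator Y."""
    n = X.shape[0]
    Y = np.zeros((n, k))
    Y[np.arange(n), labels] = 1.0
    counts = Y.sum(axis=0)   # cluster sizes, i.e. the diagonal of Y^T Y
    S = Y.T @ X              # k x d matrix of per-cluster sums
    nonempty = counts > 0
    # tr((Y^T Y)^{-1} Y^T X X^T Y) = sum_j ||S_j||^2 / n_j
    gain = ((S[nonempty] ** 2).sum(axis=1) / counts[nonempty]).sum()
    return (X ** 2).sum() - gain

# Hypothetical data: 200 samples, 5 features, 3 arbitrary clusters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
labels = rng.integers(0, 3, size=200)

a = kmeans_objective(X, labels, 3)
b = trace_form(X, labels, 3)
print(np.isclose(a, b))
```

Note that the trace form never builds the cluster centers explicitly; it only needs the per-cluster sums and sizes, which is the kind of center-free evaluation the abstract alludes to.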