t-k-means: A ROBUST AND STABLE k-means VARIANT

Yiming Li, Yang Zhang, Qingtao Tang, Weipeng Huang, Yong Jiang, Shu-Tao Xia
2020-01-01
Abstract: The k-means algorithm is one of the most classical clustering methods and has been widely and successfully used in signal processing. However, because of the thin-tailed property of the Gaussian distribution, k-means performs relatively poorly on datasets containing heavy-tailed data or outliers. In addition, the standard k-means algorithm is relatively unstable, i.e., its results have a large variance, which reduces the credibility of the model. In this paper, we propose a robust and stable k-means variant, dubbed t-k-means, together with a fast version, to alleviate these problems. Theoretically, we derive t-k-means and analyze its robustness and stability from the perspectives of the loss function and the expression of the clustering center, respectively. Extensive experiments verify the effectiveness and efficiency of the proposed method.
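As a rough illustration of the robustness argument, and not the authors' t-k-means algorithm, the sketch below compares a plain mean (the k-means center update) with a center obtained by Student-t style reweighting on a single one-dimensional cluster containing one outlier. The abstract does not give the update rule, so the weighting formula, the function names, and the parameters `nu`, `scale`, and `n_iter` are all assumptions made for this example.

```python
from statistics import median

def mean_center(points):
    """k-means style center: the arithmetic mean of the assigned points."""
    return sum(points) / len(points)

def t_weighted_center(points, nu=1.0, scale=1.0, n_iter=50):
    """Illustrative location estimate under a 1-D Student-t model.

    Uses the standard EM reweighting for a t-distribution with fixed
    scale: each point gets weight (nu + 1) / (nu + squared distance),
    so far-away points contribute very little. All parameters are
    hypothetical choices, not values from the paper.
    """
    center = median(points)  # simple starting point
    for _ in range(n_iter):
        weights = [(nu + 1.0) / (nu + ((x - center) / scale) ** 2) for x in points]
        center = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return center

# Seven inliers placed symmetrically around 0, plus one far outlier.
data = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 100.0]

mean_result = mean_center(data)        # dragged toward the outlier: 12.5
t_result = t_weighted_center(data)     # stays close to the inliers' center at 0

print("mean center:", mean_result)
print("t-weighted center:", t_result)
```

In this toy case the single outlier moves the mean far from the inliers, while the reweighted center barely moves, which is the kind of behavior a heavy-tailed model is meant to provide.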