The FRCK Clustering Algorithm for Determining Cluster Number and Removing Outliers Automatically.

Yubin Guo, Yuhang Wu, Xiaopeng Zhang, Aofeng Bo, Ximing Li
DOI: https://doi.org/10.1504/ijcse.2021.118097
2021-01-01
International Journal of Computational Science and Engineering
Abstract: Clustering is one of the most popular unsupervised techniques for grouping data. The K-means algorithm is widely used because it is simple, easy to implement, and efficient. However, the optimal cluster number for K-means is difficult to predict, and the algorithm is sensitive to outliers. In this paper, we divide outliers into two types and propose a clustering algorithm that removes both types of outliers and calculates the optimal cluster number in each clustering iteration. The algorithm is a fusion of rough clustering and K-means, abbreviated as the FRCK algorithm. Because the FRCK algorithm removes outliers precisely, the optimal cluster number can be determined more accurately, and the quality of the clustering result improves accordingly. Experiments show that the algorithm is effective.
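To make the general idea of interleaving clustering with outlier removal concrete, here is a minimal sketch in plain Python. It is not the FRCK algorithm: the abstract does not define the two outlier types or how the cluster number is chosen, so this sketch assumes a fixed k and a simple distance cutoff (mean plus z standard deviations) as a stand-in. The names `kmeans`, `trim_outliers`, `z`, and `rounds`, the first-k-points initialisation, and the sample data are all illustrative assumptions.

```python
import math

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means. Centers start at the first k points
    (a deterministic simplification, not a recommended initialisation)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Coordinate-wise mean of the cluster members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old center if a cluster becomes empty.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def trim_outliers(points, k, z=2.0, rounds=5):
    """Alternate K-means with removal of points that lie unusually far
    from their assigned center (assumed cutoff: mean + z * std)."""
    kept, removed = list(points), []
    for _ in range(rounds):
        if len(kept) <= k:
            break
        centers, labels = kmeans(kept, k)
        dists = [math.dist(p, centers[l]) for p, l in zip(kept, labels)]
        mean = sum(dists) / len(dists)
        std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
        cutoff = mean + z * std
        flagged = [p for p, d in zip(kept, dists) if d > cutoff]
        if not flagged:
            break
        removed.extend(flagged)
        kept = [p for p, d in zip(kept, dists) if d <= cutoff]
    # Re-cluster once more so the returned centers match the kept points.
    centers, _ = kmeans(kept, k)
    return kept, removed, centers

# Illustrative data: two tight groups plus one distant point.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
          (10, 11), (11, 10), (11, 11), (30, 30)]
kept, removed, centers = trim_outliers(points, k=2)
```

On this toy data the distant point (30, 30) initially drags one center away from its group; once it is flagged and removed, re-clustering places the two centers at the middle of each group. That is the effect the abstract attributes to outlier removal, shown here under much simpler assumptions than the paper's.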