A K-Prototypes Algorithm Based on Adaptive Determination of the Initial Centroids

Dongwei Guo, Yingjie Chen, Jingwen Chen
DOI: https://doi.org/10.1145/3195106.3195159
2018-01-01
Abstract: K-prototypes is a clustering algorithm for mixed (numeric and categorical) data. However, the number of clusters k must be set manually and the initial centroids are selected at random, so the cluster number is chosen blindly, clustering accuracy can be low, and results can be unstable. To address these issues, this paper proposes a strategy that adaptively determines the initial centroids based on density and distance; it determines the number of clusters automatically and chooses better initial centroids than the original algorithm. Experiments on UCI datasets show that the algorithm outperforms the traditional k-prototypes and fuzzy k-prototypes algorithms in clustering quality and stability.
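The abstract does not spell out the selection procedure, so the following is only a minimal sketch of what a density-and-distance initialization for mixed data might look like. It uses the standard k-prototypes dissimilarity (squared Euclidean distance on numeric attributes plus a weight gamma times the number of categorical mismatches) and swaps in a density-peaks-style heuristic for the selection step; this is not the authors' method. The names `mixed_distance` and `pick_initial_centroids`, the `radius` parameter, and the selection thresholds are all assumptions made for illustration.

```python
import numpy as np


def mixed_distance(a_num, a_cat, b_num, b_cat, gamma=1.0):
    """Standard k-prototypes dissimilarity between two mixed records:
    squared Euclidean distance on the numeric part plus gamma times
    the number of mismatching categorical attributes."""
    diff = np.asarray(a_num, dtype=float) - np.asarray(b_num, dtype=float)
    mismatches = sum(x != y for x, y in zip(a_cat, b_cat))
    return float(np.sum(diff ** 2)) + gamma * mismatches


def pick_initial_centroids(X_num, X_cat, radius, gamma=1.0):
    """Hypothetical density-and-distance selection (an assumption, not the
    paper's procedure). Returns indices of points that are both dense and
    far from any point ranked ahead of them by density."""
    n = len(X_num)
    # Pairwise mixed dissimilarities.
    D = np.array([
        [mixed_distance(X_num[i], X_cat[i], X_num[j], X_cat[j], gamma)
         for j in range(n)]
        for i in range(n)
    ])
    # Density: number of other points closer than `radius`.
    density = (D < radius).sum(axis=1) - 1
    # Rank points by decreasing density (ties keep their input order).
    order = np.argsort(-density, kind="stable")
    # Delta: distance to the nearest point ranked earlier; the top-ranked
    # point gets its largest distance to any point instead.
    delta = np.empty(n)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()
    # Assumed selection rule: at least average density and delta > radius.
    # The number of selected points then serves as k.
    mean_density = density.mean()
    return [i for i in range(n)
            if density[i] >= mean_density and delta[i] > radius]
```

Under these assumptions, the length of the returned list plays the role of k, and the selected records would be passed to an ordinary k-prototypes run as its starting prototypes instead of random picks.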