Using MM principles to deal with incomplete data in K-means clustering

Ali Beikmohammadi
DOI: https://doi.org/10.48550/arXiv.2212.12379
2022-12-23
Abstract: Among the many clustering algorithms, K-means is widely used because of its simplicity and fast convergence. However, it performs poorly on incomplete data, where some samples are missing some of their attributes. To solve this problem, we mainly apply MM principles to restore the symmetry of the data, so that K-means can work well. We give the pseudo-code of the algorithm and use standard datasets for experimental verification. The source code for the experiments is publicly available at the following link: https://github.com/AliBeikmohammadi/MM-Optimization/blob/main/mini-project/MM%20K-means.ipynb
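To make the idea in the abstract concrete, here is a minimal sketch of one common way to combine K-means with an MM scheme, reading MM as majorization-minimization. It is not taken from the paper or its notebook. It assumes missing attributes are marked with `None`, that each missing entry is filled with the corresponding coordinate of its assigned centroid (the majorization step), and that one ordinary K-means assignment and update pass is then run on the completed data (the minimization step). The names `kmeans_missing`, `sq_dist`, and the `init` parameter are hypothetical.

```python
import random


def sq_dist(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def kmeans_missing(data, k, n_iter=100, init=None, seed=0):
    """Cluster rows of `data` (lists of floats, None = missing) into k groups.

    Returns (labels, centroids, filled), where `filled` is the completed data.
    `init` optionally gives the row indices used as initial centroids
    (an illustrative parameter, not part of the paper).
    """
    n, d = len(data), len(data[0])

    # Initial completion: fill each missing entry with its column's observed mean.
    col_means = []
    for j in range(d):
        observed = [row[j] for row in data if row[j] is not None]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    filled = [[col_means[j] if v is None else v for j, v in enumerate(row)]
              for row in data]

    if init is None:
        init = random.Random(seed).sample(range(n), k)
    centroids = [list(filled[i]) for i in init]

    labels = None
    for _ in range(n_iter):
        # Minimization, part 1: assign each completed row to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(x, centroids[c]))
                      for x in filled]

        # Minimization, part 2: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        for c in range(k):
            members = [filled[i] for i in range(n) if new_labels[i] == c]
            if members:
                centroids[c] = [sum(m[j] for m in members) / len(members)
                                for j in range(d)]

        # Majorization: rebuild the surrogate by re-filling every missing entry
        # with the matching coordinate of the row's current centroid.
        for i, row in enumerate(data):
            for j, v in enumerate(row):
                if v is None:
                    filled[i][j] = centroids[new_labels[i]][j]

        # Stop once the assignments no longer change.
        converged = new_labels == labels
        labels = new_labels
        if converged:
            break

    return labels, centroids, filled


# Toy usage: two well-separated groups with one missing attribute in each.
data = [[0.0, 0.1], [0.2, None], [0.1, 0.0],
        [10.0, 10.1], [None, 9.9], [10.2, 10.0]]
labels, centroids, filled = kmeans_missing(data, k=2, init=[0, 3])
```

Observed entries are never overwritten in this sketch; only the `None` positions change between iterations. Like ordinary K-means, the result depends on the initial centroids.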
Machine Learning