Hybridization of K-means with improved firefly algorithm for automatic clustering in high dimension

Afroj Alam
DOI: https://doi.org/10.48550/arXiv.2302.10765
2023-02-10
Abstract: K-means is the best-known partitioning algorithm for clustering; it divides data objects into multiple clusters with little effort. However, choosing an appropriate number of clusters for K-means without prior domain knowledge of the dataset is challenging, especially for high-dimensional data. We therefore apply the Silhouette and Elbow methods together with PCA to find an optimal number of clusters. Many nature-inspired meta-heuristic swarm intelligence algorithms have also been used for the automatic data clustering problem. Among them, the Firefly algorithm is efficient and robust for automatic clustering. However, in the Firefly algorithm the entire population is automatically subdivided into sub-populations, which slows convergence and leads to trapping in local minima in high-dimensional optimization problems. Our study therefore proposes an enhanced Firefly approach, i.e., K-means hybridized with an ODFA model, for automatic clustering. The experimental section presents outputs and graphs for the Silhouette and Elbow methods as well as for the Firefly algorithm.
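To make the cluster-count selection step concrete, here is a minimal sketch of how a Silhouette score and an Elbow-style inertia value can be computed for candidate values of k. It is not the paper's implementation: the function names (`init_centers`, `kmeans`, `inertia`, `silhouette`, `choose_k`) are hypothetical, the farthest-point seeding is an assumption chosen only to keep the example deterministic, and the PCA reduction and the Firefly/ODFA stage are omitted.

```python
import math


def init_centers(points, k):
    """Farthest-point seeding (an assumption, not from the paper): deterministic."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point whose nearest existing center is farthest away.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest center for every point.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


def inertia(points, labels, centers):
    """Within-cluster sum of squared distances, the quantity plotted by the Elbow method."""
    return sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {}
    for k in k_values:
        labels, _ = kmeans(points, k)
        scores[k] = silhouette(points, labels)
    return max(scores, key=scores.get), scores
```

Calling `choose_k(points, range(2, 6))` on a list of coordinate tuples returns the candidate k with the highest mean silhouette; plotting `inertia` against k for the same range gives the curve that the Elbow method inspects for a bend.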
Machine Learning