Fuzzy C-Means Clustering Validity Function Based on Multiple Clustering Performance Evaluation Components
Guan Wang, Jie-Sheng Wang, Hong-Yu Wang
DOI: https://doi.org/10.1007/s40815-021-01243-2
IF: 4.085
2022-02-21
International Journal of Fuzzy Systems
Abstract: Clustering is the process of grouping a set of physical or abstract objects into multiple classes of similar objects. Fuzzy C-means (FCM) clustering is one of the most widely used clustering methods. A central research problem for FCM is finding the optimal number of clusters for a data set, which determines whether the data can be partitioned effectively. The study of clustering validity functions addresses this problem: a validity function evaluates clustering quality and is used to determine the optimal number of clusters. Based on the idea of components, six cluster performance evaluation components are proposed to define the compactness, variation, similarity, overlap and separation of data sets. A new validity function for the FCM clustering algorithm is then synthesized from these six components. Finally, the proposed validity function is compared with eight typical validity functions on five artificial data sets and eight UCI data sets. The simulation results show that the proposed validity function evaluates the clustering results more effectively and determines the optimal number of clusters for the different data sets.
computer science, information systems, automation & control systems, artificial intelligence
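
The abstract describes the general workflow of validity-based model selection: run FCM for several candidate cluster numbers, score each result with a validity function, and keep the best-scoring number. The following is a minimal sketch of that workflow only. It does not implement the six-component validity function proposed in the paper, whose formulas are not given in this record; the classical Xie-Beni index (lower is better) is swapped in as a stand-in. The function names `sqdist`, `fcm`, `xie_beni` and `best_cluster_number`, the deterministic initialisation, and the default parameters are all assumptions made for illustration.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm(points, c, m=2.0, iters=200, tol=1e-9):
    """Plain fuzzy C-means with fuzzifier m > 1.

    Assumption: centres are initialised from evenly spaced data points so
    the sketch is deterministic; practical implementations usually use
    random restarts instead.
    """
    n, dim = len(points), len(points[0])
    centers = [tuple(points[(k * n) // c]) for k in range(c)]
    u = [[0.0] * c for _ in range(n)]
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1));
        # each row is normalised to sum to 1.
        for i, p in enumerate(points):
            inv = [max(sqdist(p, v), 1e-12) ** (-1.0 / (m - 1.0))
                   for v in centers]
            s = sum(inv)
            u[i] = [x / s for x in inv]
        # Centre update: mean of the data weighted by u_ik^m.
        new_centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        shift = max(sqdist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness over n times the minimum
    squared separation between centres. Lower is better."""
    n, c = len(points), len(centers)
    compact = sum(u[i][k] ** m * sqdist(points[i], centers[k])
                  for i in range(n) for k in range(c))
    min_sep = min(sqdist(centers[j], centers[k])
                  for j in range(c) for k in range(j + 1, c))
    return float("inf") if min_sep == 0.0 else compact / (n * min_sep)


def best_cluster_number(points, c_min=2, c_max=5, m=2.0):
    """Score every candidate cluster number and return the minimiser."""
    scores = {}
    for c in range(c_min, c_max + 1):
        centers, u = fcm(points, c, m)
        scores[c] = xie_beni(points, centers, u, m)
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points (made-up toy data).
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    best_c, scores = best_cluster_number(data, c_min=2, c_max=4)
    print(best_c, scores)
```

Any other validity function with the same signature could be dropped in place of `xie_beni`, which is the point of the design: the cluster-number search is independent of how cluster quality is scored. Indices where higher is better would need `max` rather than `min` in `best_cluster_number`.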