Abstract: Because of the large variety of cluster validation indices (CVIs), choosing the most suitable index is challenging. We assessed several CVIs using artificial binary data sets. Only a few CVIs performed as expected with noisy data. Tau and silhouette widths proved to be the best geometric CVIs for both equal and unequal cluster sizes. Among non‐geometric indices, crispness and OptimClass performed best.

Aims: Different clustering methods often classify the same data set differently. Cluster validation indices make it possible to select the "best" clustering solution from the alternatives. Because of the large variety of CVIs, choosing the index best suited to the data set and the clustering algorithms is challenging. We aim to assess different internal clustering validation indices.

Methods: Artificial binary data sets with equal‐ and unequal‐sized, well‐separated a priori clusters were simulated, and three levels of noise were then added. Twenty replications of each of the six types of data sets (two group sizes × three levels of noise) were created and analyzed by three clustering algorithms with Jaccard dissimilarity. Twenty‐seven clustering validation indices were evaluated, including both geometric and non‐geometric indices.

Results: Although, in theory, all CVIs could differentiate between good and poor classifications, only a few performed as expected with noisy data. Tau and silhouette widths proved to be the best geometric CVIs for both equal and unequal cluster sizes. Among non‐geometric indices, crispness and OptimClass performed best.

Conclusion: We recommend using these best‐performing CVIs. We also suggest plotting the CVI value against the number of clusters, because the lack of a sharp peak means that the position of the maximum is uncertain.
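The workflow summarized in the abstract (simulate noisy binary clusters, compute Jaccard dissimilarity, cluster, then inspect a CVI across the number of clusters) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the block-structured simulation, the noise level of 0.1, the four clusters of 15 objects, the use of average linkage as the single clustering algorithm, and all function names. Only the mean silhouette width is computed, as one example of a geometric CVI.

```python
import random

def jaccard(a, b):
    """Jaccard dissimilarity between two binary vectors (lists of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either

def simulate(sizes, vars_per_cluster, noise, rng):
    """Hypothetical generator: each cluster owns a block of variables set to 1;
    every cell is then flipped with probability `noise`."""
    n_vars = len(sizes) * vars_per_cluster
    data = []
    for c, size in enumerate(sizes):
        for _ in range(size):
            row = [int(j // vars_per_cluster == c) for j in range(n_vars)]
            data.append([1 - v if rng.random() < noise else v for v in row])
    return data

def average_linkage(dist, k):
    """Naive average-linkage agglomeration; returns one label per object."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                d /= len(clusters[p]) * len(clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] += clusters.pop(q)  # q > p, so index p stays valid
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def mean_silhouette(dist, labels):
    """Mean silhouette width; objects in singleton clusters contribute 0."""
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = groups[lab]
        if len(own) == 1:
            continue
        # a: mean dissimilarity to the other members of the own cluster
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        # b: smallest mean dissimilarity to any other cluster
        b = min(sum(dist[i][j] for j in g) / len(g)
                for other, g in groups.items() if other != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = simulate(sizes=[15, 15, 15, 15], vars_per_cluster=10, noise=0.1, rng=rng)
dist = [[jaccard(a, b) for b in data] for a in data]

# CVI value for each candidate number of clusters
scores = {k: mean_silhouette(dist, average_linkage(dist, k)) for k in range(2, 9)}
for k, s in scores.items():
    print(f"k={k}: mean silhouette width = {s:.3f}")
best_k = max(scores, key=scores.get)
print("number of clusters with the highest silhouette width:", best_k)
```

Printing or plotting the whole `scores` profile, rather than reporting only `best_k`, follows the recommendation in the conclusion: a flat profile without a sharp peak signals that the position of the maximum is uncertain.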

Understanding partition comparison indices based on counting object pairs

Comparing high dimensional partitions, with the Coclustering Adjusted Rand Index

On the Use of Relative Validity Indices for Comparing Clustering Approaches

Adjusting for Chance Clustering Comparison Measures

Graph Sensitive Indices for Comparing Clusterings

On Seeking Consensus Between Document Similarity Measures

Reliability of Partitioning Metric Space Data

On the Index of Cluster Validity

Performance evaluation of some clustering algorithms and validity indices

Normalised clustering accuracy: An asymmetric external cluster validity measure

A Validity Index Approach for Network Partitions

Extended multivariate comparison of 68 cluster validity indices. A review

Approximate Ranking from Pairwise Comparisons

Quantitative evaluation of internal cluster validation indices using binary data sets

A Comparative Review of Methods for Comparing Means Using Partially Paired Data

Word Clustering with Validity Indices

Near-Optimal Comparison Based Clustering

Towards quantification of incompleteness in the pairwise comparisons method

An Information-Theoretic External Cluster-Validity Measure

A Split-Merge Framework for Comparing Clusterings

Comparing clusterings and numbers of clusters by aggregation of calibrated clustering validity indexes