Algorithm 1038: KCC: A MATLAB Package for K-Means-based Consensus Clustering

Hao Lin, Hongfu Liu, Junjie Wu, Hong Li, Stephan Guennemann
DOI: https://doi.org/10.1145/3616011
IF: 2.464
2023-01-01
ACM Transactions on Mathematical Software
Abstract: Consensus clustering is gaining increasing attention for its high quality and robustness. In particular, k-means-based Consensus Clustering (KCC) converts the usually computationally expensive consensus problem into a classic k-means clustering with generalized utility functions, which opens the way to large-scale clustering of different types of data. Despite KCC's applicability and generalizability, implementing the method is challenging, for example in representing the binary dataset within the k-means heuristic, and this has seldom been discussed in prior work. To fill this gap, we present a MATLAB package, KCC, that fully implements the KCC framework and uses a sparse representation technique to achieve low space complexity. Compared to alternative consensus clustering packages, the KCC package offers high flexibility, efficiency, and effectiveness. Extensive numerical experiments are also included to show its usability on real-world datasets.
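The idea in the abstract can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not the KCC package's API: the function names (one_hot_columns, sq_dist, consensus_kmeans), the farthest-first initialisation, and the use of plain squared Euclidean distance in place of the generalized utility functions are all assumptions made for illustration. Each object is stored as the column indices of its ones in the binary matrix formed by one-hot encoding every basic partition, so only r integers are kept per object instead of a full binary row.

```python
def one_hot_columns(partitions):
    """For each object, return the column indices of its ones in the
    concatenated one-hot (binary) matrix, plus that matrix's width."""
    sizes = [max(p) + 1 for p in partitions]            # clusters per basic partition
    offsets = [sum(sizes[:i]) for i in range(len(partitions))]
    n = len(partitions[0])
    rows = [[offsets[i] + p[j] for i, p in enumerate(partitions)] for j in range(n)]
    return rows, sum(sizes)


def sq_dist(cols, centroid, centroid_sq_norm):
    # For a binary row x with len(cols) ones:
    # ||x - m||^2 = ||m||^2 - 2 * sum(m[c] for c in cols) + len(cols)
    return centroid_sq_norm - 2.0 * sum(centroid[c] for c in cols) + len(cols)


def dense(cols, width):
    """Expand a sparse binary row into a dense vector (used for centroids only)."""
    v = [0.0] * width
    for c in cols:
        v[c] = 1.0
    return v


def consensus_kmeans(partitions, k, max_iter=100):
    """Hypothetical helper: k-means over sparsely stored binary rows."""
    rows, width = one_hot_columns(partitions)
    # Farthest-first initialisation: deterministic, chosen for this sketch only.
    centroids = [dense(rows[0], width)]
    while len(centroids) < k:
        far = max(
            range(len(rows)),
            key=lambda j: min(sq_dist(rows[j], m, sum(v * v for v in m)) for m in centroids),
        )
        centroids.append(dense(rows[far], width))
    labels = None
    for _ in range(max_iter):
        norms = [sum(v * v for v in m) for m in centroids]
        new_labels = [
            min(range(k), key=lambda c: sq_dist(cols, centroids[c], norms[c]))
            for cols in rows
        ]
        if new_labels == labels:                         # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [rows[j] for j in range(len(rows)) if labels[j] == c]
            if members:                                  # keep old centroid if cluster is empty
                m = [0.0] * width
                for cols in members:
                    for col in cols:
                        m[col] += 1.0 / len(members)
                centroids[c] = m
    return labels


# Three basic partitions of six objects (made-up toy data).
basic_partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],   # same grouping, labels permuted
    [0, 0, 1, 1, 2, 2],   # a noisier three-cluster partition
]
consensus = consensus_kmeans(basic_partitions, k=2)
```

On this toy input the two agreeing partitions outweigh the noisy one, and the consensus groups the first three objects apart from the last three. The distance computation touches only the r nonzero columns of each object, which is the point of the sparse storage; the centroids remain dense.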