Sparse Dual-Weighting Ensemble Clustering

Pan Xu, Hui Gao, Yixuan Wang
DOI: https://doi.org/10.1007/s10586-024-04864-y
2024-01-01
Cluster Computing
Abstract: Ensemble clustering methods combine multiple base clusterings to produce more accurate and reliable results than traditional clustering methods, and have consequently gained popularity in recent years. In this paper, we propose a novel ensemble clustering method, dubbed Sparse Dual-Weighting Ensemble Clustering (SDWEC), that thoroughly considers the quality and diversity of base clusterings, which are critical yet overlooked in many existing ensemble clustering methods. Specifically, SDWEC employs a dual-weighting scheme that adaptively weighs the importance of each base clustering while also considering the reliability of the clusters within it. A sparse constraint enables SDWEC to select and fuse the most beneficial information from the base clusterings, thereby enhancing clustering performance. Unlike many existing ensemble clustering methods that require extra post-processing to extract the indicator matrix, SDWEC directly learns the cluster indicators without any discretization, thus avoiding potential information loss. To solve the resulting optimization problem, we design an efficient alternating optimization algorithm with linear complexity and theoretical convergence guarantees. We conduct extensive experiments on eight real-world datasets to evaluate SDWEC. Results against state-of-the-art ensemble clustering methods demonstrate its superiority in terms of clustering accuracy and robustness.
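
To make the dual-weighting idea concrete, the following is a minimal sketch of a generic weighted co-association matrix, not the SDWEC objective itself. The abstract does not give the formulation, so everything here is an assumption for illustration: the function name `dual_weighted_coassociation`, the fixed (rather than adaptively learned) weights, and the toy data. It omits the sparse constraint and the direct learning of cluster indicators, and it costs O(n^2) per base clustering, unlike the linear-complexity algorithm the abstract describes.

```python
def dual_weighted_coassociation(base_labels, clustering_weights, cluster_weights):
    """Build an n x n co-association matrix from weighted base clusterings.

    All three inputs are hypothetical, hand-set values for illustration:
      base_labels[m][i]     -- label of sample i in base clustering m
      clustering_weights[m] -- importance of base clustering m
      cluster_weights[m][c] -- reliability of cluster c in base clustering m
    """
    n = len(base_labels[0])
    S = [[0.0] * n for _ in range(n)]
    for labels, w, reliability in zip(base_labels, clustering_weights, cluster_weights):
        for i in range(n):
            for j in range(n):
                # Two samples co-occur only if this base clustering puts them
                # in the same cluster; the contribution is scaled by both the
                # clustering-level and the cluster-level weight.
                if labels[i] == labels[j]:
                    S[i][j] += w * reliability[labels[i]]
    return S


# Toy example: four samples, two base clusterings.
base_labels = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
clustering_weights = [0.75, 0.25]   # first clustering is trusted more
cluster_weights = [
    {0: 1.0, 1: 1.0},
    {0: 0.5, 1: 1.0},               # cluster 0 of the second clustering is less reliable
]
S = dual_weighted_coassociation(base_labels, clustering_weights, cluster_weights)
```

In this toy setting, samples 0 and 1 are grouped together by both base clusterings and receive the highest off-diagonal similarity, while samples 0 and 2 are linked only through a low-reliability cluster of the less trusted clustering and receive a much smaller value.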