Semi-supervised Clustering Guided by Pairwise Constraints and Local Density Structures

Zhiguo Long,Yang Gao,Hua Meng,Yuxu Chen,Hui Kou
DOI: https://doi.org/10.1016/j.patcog.2024.110751
IF: 8
2024-01-01
Pattern Recognition
Abstract:Clustering based on local density peaks and graph cut (LDP-SC) is one of the state-of-the-art algorithms in unsupervised clustering, which first divides the data set to be multiple local trees, and then aggregates these local trees to obtain the final clustering result. However, for complex data sets, there might exist data points from different classes in the same local tree. In this article, we use pairwise constraint information to resolve this issue and propose a semi-supervised local density peaks and graph cut based clustering algorithm (SLDPC). In particular, SLDPC proposes intra-cluster conflict resolution and inter-cluster conflict resolution steps to split the local trees which are inconsistent with the provided pairwise constraint information. Theoretically, we show that the two steps will finish in a finite number of operations and the split local trees will be consistent with the pairwise constraint information. Subsequently, root node redirection and noise filtering steps are designed to avoid the local trees becoming too fragmented. Finally, we exploit the E2CP algorithm to further improve the similarity matrix between local trees using the pairwise constraint information, and the spectral clustering algorithm is adopted to obtain the clustering result. Experiments on multiple widely used synthetic and real-world data sets show that SLDPC is superior to LDP-SC and several other semi-supervised prominent clustering algorithms for most of the cases.
What problem does this paper attempt to address?