Combining core points and cluster-level semantic similarity for self-supervised clustering

Wenjie Wang, Junfen Chen, Xiao Zhang, Bojun Xie
DOI: https://doi.org/10.1007/s13042-023-02084-1
2024-02-11
International Journal of Machine Learning and Cybernetics
Abstract: Contrastive learning utilizes data augmentation to guide network training. This approach has attracted considerable attention for clustering, object detection, and image segmentation. However, previous studies have ignored the impact of false-negative pairs, resulting in the dissimilarity of the semantic representations of the same cluster. Some researchers have attempted to address this problem; however, only considering the image level has provided unsatisfactory results. To this end, we propose a novel feature extraction algorithm suitable for clustering, combining core points and semantic similarity at the cluster level to restructure positive and negative pairs. Specifically, the core points consisting of the n-nearest neighbors of the cluster center are considered the semantic sample relations of the cluster. This information is explored to reconstruct semantic positive and negative pairs to maximize intra-cluster similarity and inter-cluster variability. More accurate cluster centers offer a sub-optimal initialization for updating the feature model and clustering assignment, which is optimized by the expectation-maximization framework. Extensive experiments conducted on six benchmark datasets show promising clustering performances with relatively few training epochs. The proposed method outperforms the best baseline by 4% (1.5%) on CIFAR-100 (CIFAR-10). The CPCS code is open-sourced at https://github.com/Cappuccino-Sugar/CPCS.
computer science, artificial intelligence
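The core-point idea in the abstract (taking the n nearest neighbors of each cluster center and using them to rebuild positive and negative pairs) can be illustrated with a small NumPy sketch. This is not the authors' CPCS implementation; the function names `select_core_points` and `build_pair_masks`, the use of Euclidean distance, the use of the cluster mean as the center, and the rule that only core points take part in pairs are all assumptions made for illustration.

```python
import numpy as np


def select_core_points(features, labels, n_core):
    """Map each cluster id to the indices of its n_core samples nearest the cluster mean.

    Assumption: the cluster center is the mean feature vector and
    distance is Euclidean; the paper may define these differently.
    """
    core = {}
    for k in np.unique(labels):
        members = np.flatnonzero(labels == k)
        center = features[members].mean(axis=0)
        dists = np.linalg.norm(features[members] - center, axis=1)
        nearest = np.argsort(dists)[:n_core]
        core[int(k)] = members[nearest]
    return core


def build_pair_masks(core, n_samples):
    """Build boolean (n_samples, n_samples) masks over core points only.

    positive[i, j]: i and j are distinct core points of the same cluster.
    negative[i, j]: i and j are core points of different clusters.
    Assumption: cluster ids are non-negative integers; non-core samples
    are left out of both masks.
    """
    core_label = np.full(n_samples, -1)
    for k, idx in core.items():
        core_label[idx] = k
    is_core = core_label >= 0
    both_core = is_core[:, None] & is_core[None, :]
    same_cluster = core_label[:, None] == core_label[None, :]
    positive = both_core & same_cluster
    np.fill_diagonal(positive, False)  # a sample is not its own positive
    negative = both_core & ~same_cluster
    return positive, negative


if __name__ == "__main__":
    # Toy 2-D features with a given (hypothetical) cluster assignment.
    feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0],
                      [1.0, 0.0], [5.1, 5.0], [9.0, 9.0]])
    labs = np.array([0, 0, 1, 0, 1, 1])
    core = select_core_points(feats, labs, n_core=2)
    pos, neg = build_pair_masks(core, len(feats))
    print(core)
    print(pos.astype(int))
    print(neg.astype(int))
```

In a contrastive loss, masks like these would decide which similarities are pulled together and which are pushed apart, so that samples near the same center are no longer treated as negatives of each other. How the paper weights these pairs and alternates the update with cluster assignment in its expectation-maximization loop is not specified in the abstract and is not modeled here.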