Information Maximization Clustering Via Multi-View Self-Labelling

Foivos Ntelemis,Yaochu Jin,Spencer A. Thomas
DOI: https://doi.org/10.1016/j.knosys.2022.109042
IF: 8.139
2022-01-01
Knowledge-Based Systems
Abstract:Image clustering is a particularly challenging computer vision task, which aims to generate annotations without human supervision. Recent advances focus on the use of self-supervised learning strategies in image clustering, by first learning valuable semantics and then clustering the image representations. These multiple-phase algorithms, however, involve several hyper-parameters and transformation functions, and are computationally intensive. By extending the self-supervised approach, this work proposes a novel single-phase clustering method that simultaneously learns meaningful representations and assigns the corresponding annotations. This is achieved by integrating a discrete representation into the self-supervised paradigm through a classifier net. Specifically, the proposed clustering objective employs mutual information, and maximizes the dependency between the integrated discrete representation and a discrete probability distribution. The discrete probability distribution is derived through the self-supervised process by comparing the learnt latent representation with a set of trainable prototypes. To enhance the learning performance of the classifier, we jointly apply the mutual information across multi-crop views. Our empirical results show that the proposed framework outperforms state-of-the-art techniques with an average clustering accuracy of 89.1%, 49.0%, 83.1% and 27.9%, respectively, on the baseline datasets of CIFAR-10, CIFAR-100/20, STL10 and TinyImageNet/200. Finally, the proposed method also demonstrates attractive robustness to parameter settings, and to a large number of classes, making it ready to be applicable to other datasets. The implementation of our method is available online.
What problem does this paper attempt to address?