Abstract: In recent years, the expressive power of Graph Convolutional Networks (GCNs) has enabled significant breakthroughs in face clustering. However, little attention has been paid to GCN-based clustering on imbalanced data. Although the imbalance problem has been extensively studied, imbalanced data affects the GCN-based linkage prediction task quite differently, causing problems in two respects: imbalanced linkage labels and biased graph representations. The former resembles the problem in classic image classification, but the latter is particular to GCN-based clustering via linkage prediction. Significantly biased graph representations in training can cause catastrophic over-fitting of a GCN model. To tackle these challenges, we propose a linkage-based doubly imbalanced graph learning framework for face clustering. In this framework, we evaluate the feasibility of applying existing methods for imbalanced image classification to GCNs, and present a new method that alleviates the label imbalance and augments graph representations using a Reverse-Imbalance Weighted Sampling (RIWS) strategy. With the RIWS strategy, probability-based class-balancing weights help ensure a balanced overall distribution of positive and negative samples; in addition, weighted random sampling provides diverse subgraph structures, which effectively alleviates over-fitting and improves the representation ability of GCNs. Extensive experiments on a series of imbalanced benchmark datasets synthesized from MS-Celeb-1M and DeepFashion demonstrate the effectiveness and generality of our proposed method. Our implementation and the synthesized datasets will be openly available at <a class="link-external link-https" href="https://github.com/espectre/GCNs_on_imbalanced_datasets" rel="external noopener nofollow">this https URL</a>.
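
The general idea behind class-balancing weights combined with weighted random sampling can be illustrated with a minimal sketch. This is not the authors' RIWS implementation, whose details the abstract does not give. It assumes plain inverse-frequency weights, and the function names `reverse_imbalance_weights` and `weighted_sample` are hypothetical.

```python
import random
from collections import Counter


def reverse_imbalance_weights(labels):
    """Return one sampling weight per item, inversely proportional
    to the frequency of the item's class (an assumed weighting scheme)."""
    counts = Counter(labels)
    n_classes = len(counts)
    # Each class gets total probability mass 1 / n_classes,
    # shared equally among the members of that class.
    return [1.0 / (n_classes * counts[y]) for y in labels]


def weighted_sample(items, labels, k, seed=None):
    """Draw k items with replacement using the class-balancing weights."""
    rng = random.Random(seed)
    weights = reverse_imbalance_weights(labels)
    idx = rng.choices(range(len(items)), weights=weights, k=k)
    return [items[i] for i in idx], [labels[i] for i in idx]


# Toy example: 90 negative and 10 positive linkage labels.
labels = [0] * 90 + [1] * 10
items = list(range(100))
sampled_items, sampled_labels = weighted_sample(items, labels, k=2000, seed=0)
positive_ratio = sum(sampled_labels) / len(sampled_labels)
```

Under these assumptions each class is drawn with roughly equal probability regardless of its original size, and repeated draws with different seeds yield different subsets, which is the sense in which weighted random sampling can supply varied training inputs.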

Joint Debiased Representation Learning and Imbalanced Data Clustering

Joint Unsupervised Learning of Deep Representations and Image Clusters

Learning a Bi-Directional Discriminative Representation for Deep Clustering

Mejigclu: more effective jigsaw clustering for unsupervised visual representation learning

Graph Debiased Contrastive Learning with Joint Representation Clustering

Self-labelling via simultaneous clustering and representation learning

Joint Representation Learning and Clustering: A Framework for Grouping Partial Multiview Data

Joint Shared-and-Specific Information for Deep Multi-View Clustering

Salient and consensus representation learning based incomplete multiview clustering

A Linkage-based Doubly Imbalanced Graph Learning Framework for Face Clustering

Learning deep discriminative representations with pseudo supervision for image clustering

Learning Robust Representation for Clustering Through Locality Preserving Variational Discriminative Network

Consensus Clustering With Unsupervised Representation Learning

CNN-Based Joint Clustering and Representation Learning with Feature Drift Compensation for Large-Scale Image Data

Learning Deep Representation with Energy-Based Self-Expressiveness for Subspace Clustering

Cluster Specific Representation Learning

Robust Representation Learning for Image Clustering

Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing

Hierarchically Clustered Representation Learning

Joint Deep Multi-View Learning for Image Clustering

T-distributed Stochastic Neighbor Embedding for Co-Representation Learning