Abstract: Hyperbolic manifolds for visual representation learning allow for effective learning of semantic class hierarchies by naturally embedding tree-like structures with low distortion within a low-dimensional representation space. The highly separable semantic class hierarchies produced by hyperbolic learning have been shown to be powerful in low-shot tasks; however, their application in self-supervised learning is yet to be explored fully. In this work, we explore the use of hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches. First, we extend Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space. Second, we place prototypes on the ideal boundary of the Poincaré ball. Unlike previous methods, we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic. Empirically, we demonstrate the ability of these methods to perform comparably to Euclidean methods in lower dimensions for linear evaluation tasks, whilst showing improvements in extreme few-shot learning tasks.

What problem does this paper attempt to address?

### Main Problems Addressed by the Paper

This paper primarily addresses the issue of leveraging hyperbolic space in self-supervised learning (SSL) to better embed the semantic hierarchical structure in natural images.

### Specific Goals and Contributions

1. **Proposing Hyperbolic Masked Siamese Networks (HMSN)**:
   - Extending Masked Siamese Networks (MSNs) to hyperbolic space (Poincaré ball model) to utilize the low-distortion embedding capability of hyperbolic space for tree-like structures.
   - Demonstrating through experiments that HMSN can perform comparably to Euclidean methods in linear evaluation tasks with fewer dimensions and show improvements in extreme few-shot learning tasks.
2. **Introducing Ideal Prototypes**:
   - Placing prototypes on the ideal boundary of the Poincaré ball to encourage full utilization of hyperbolic space.
   - Proposing a new loss function based on the Busemann distance metric to train the network to produce good hyperbolic representations.
3. **Proposing Hyperbolic Projection Head**:
   - Projecting Euclidean representations to the hyperbolic space of the Poincaré ball at the encoder output to ensure that the representations used in downstream tasks retain hyperbolic properties.
   - Using a fully hyperbolic projection network to ensure that the learned hyperbolicity can be utilized in downstream tasks.

### Overview of Experimental Results

- **Linear Classification**: HMSN-IP performs similarly to the MSN baseline in linear classification on the ImageNet-1K dataset but uses fewer embedding dimensions (64 dimensions compared to 256 dimensions).
- **Few-Shot Linear Classification**: HMSN-IP outperforms its Euclidean baseline in few-shot linear classification tasks using only 1% of labeled samples, with a performance improvement of 1.0%.
- **Impact of Projection Head**: HMSN-IP with a hyperbolic projection head achieves higher performance under a hyperbolic linear classifier, demonstrating the importance of hyperbolic properties in downstream tasks.

In summary, by introducing the concept of hyperbolic space and corresponding technical improvements, this paper aims to enhance the performance of self-supervised learning methods in few-shot learning tasks and validates the effectiveness of the proposed improvements through experiments.
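The two geometric ingredients named above, projecting encoder outputs onto the Poincaré ball and scoring them against prototypes on the ideal boundary, can be illustrated with a small sketch. This is not the paper's loss or architecture, which the summary does not spell out. It assumes the unit ball (curvature -1), uses the standard exponential map at the origin and the standard closed form of the Busemann function, and the names `expmap0`, `busemann`, and `nearest_ideal_prototype` are hypothetical.

```python
import math

def expmap0(v, c=1.0, max_norm=1.0 - 1e-5):
    """Exponential map at the origin: Euclidean vector -> Poincaré ball of curvature -c.

    Assumption: the tanh factor is clipped to max_norm so the result stays
    strictly inside the ball even for very large inputs.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    sqrt_c = math.sqrt(c)
    radius = min(math.tanh(sqrt_c * norm), max_norm)
    return [radius * x / (sqrt_c * norm) for x in v]

def busemann(z, p):
    """Busemann function on the unit Poincaré ball for an ideal point p (||p|| = 1).

    b_p(z) = log(||p - z||^2 / (1 - ||z||^2)); it is 0 at the origin and
    decreases without bound as z approaches p.
    """
    diff_sq = sum((pi - zi) ** 2 for pi, zi in zip(p, z))
    z_sq = sum(zi * zi for zi in z)
    return math.log(diff_sq) - math.log(1.0 - z_sq)

def nearest_ideal_prototype(z, prototypes):
    """Index of the ideal prototype with the smallest Busemann value (hypothetical helper)."""
    return min(range(len(prototypes)), key=lambda k: busemann(z, prototypes[k]))

# Illustrative values only: a 2-D "encoder output" and four prototypes on the unit circle.
features = [3.0, 0.5]
prototypes = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

z = expmap0(features)
z_norm = math.sqrt(sum(x * x for x in z))
assigned = nearest_ideal_prototype(z, prototypes)
print(z_norm, assigned)
```

Because ideal prototypes sit at infinite hyperbolic distance from every interior point, an ordinary distance cannot be used to compare an embedding with them; the Busemann function is the usual finite substitute, which is presumably why the summary describes the loss as Busemann-based.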

HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes

A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry

Enhancing Hyperbolic Graph Embeddings via Contrastive Learning

Self-Supervised 3D Behavior Representation Learning Based on Homotopic Hyperbolic Embedding

Hyperbolic Deep Learning in Computer Vision: A Survey

Hyperbolic Representation Learning: Revisiting and Advancing

Hyperspherical Prototype Networks

Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin

Fully Hyperbolic Convolutional Neural Networks for Computer Vision

Exploring Hierarchical Information in Hyperbolic Space for Self-Supervised Image Hashing

Hyperbolic Contrastive Learning for Visual Representations beyond Objects

A Regularized Approach for Geodesic-Based Semisupervised Multimanifold Learning

Nested Hyperbolic Spaces for Dimensionality Reduction and Hyperbolic NN Design

Provably Accurate and Scalable Linear Classifiers in Hyperbolic Spaces

Unsupervised Feature Learning with Emergent Data-Driven Prototypicality

Hyperbolic Convolutional Neural Networks

Learning Structured Representations with Hyperbolic Embeddings

Hyperbolic Graph Representation Learning: A Tutorial

Hyper-Laplacian Regularized Multilinear Multiview Self-Representations for Clustering and Semisupervised Learning

Hyperbolic Deep Neural Networks: A Survey