Abstract: Self-supervised learning (SSL) provides a promising alternative for representation learning on hypergraphs without costly labels. However, existing hypergraph SSL models are mostly based on contrastive methods with the instance-level discrimination strategy, suffering from two significant limitations: (1) They select negative samples arbitrarily, which is unreliable in deciding similar and dissimilar pairs, causing training bias. (2) They often require a large number of negative samples, resulting in expensive computational costs. To address the above issues, we propose SE-HSSL, a hypergraph SSL framework with three sampling-efficient self-supervised signals. Specifically, we introduce two sampling-free objectives leveraging the canonical correlation analysis as the node-level and group-level self-supervised signals. Additionally, we develop a novel hierarchical membership-level contrast objective motivated by the cascading overlap relationship in hypergraphs, which can further reduce membership sampling bias and improve the efficiency of sample utilization. Through comprehensive experiments on 7 real-world hypergraphs, we demonstrate the superiority of our approach over the state-of-the-art method in terms of both effectiveness and efficiency.

### What problems does this paper attempt to solve?

This paper aims to solve two key problems in hypergraph self-supervised learning (HSSL):

1. **Training Bias**: Most existing hypergraph self-supervised learning models are based on contrastive learning with an instance-level discrimination strategy. This strategy selects negative samples arbitrarily, so it cannot reliably distinguish similar from dissimilar node pairs or hyperedge pairs, which introduces training bias. For example, in a co-author hypergraph, if authors in the same research field but with different cooperation frequencies are treated as negative samples, the model may learn incorrect representations.
2. **Sampling Inefficiency**: These models usually require a large number of negative samples to achieve optimal performance, which brings high computational costs, especially on large-scale hypergraphs. For example, the TriCL method has a time complexity of \(O(|V|\times|E|)\) when calculating the scoring function, where \(|V|\) and \(|E|\) are the numbers of nodes and hyperedges, respectively. This complexity greatly limits the training speed.

To address these problems, the paper proposes SE-HSSL (Sampling-Efficient Hypergraph Self-Supervised Learning), an efficient hypergraph self-supervised learning framework. SE-HSSL addresses them in the following ways:

- **Introducing Sampling-Free Self-Supervised Signals**: SE-HSSL introduces node-level and group-level self-supervised signals based on Canonical Correlation Analysis (CCA). These signals do not rely on negative samples, which reduces training bias and improves the discriminability of the representations.
- **Designing a Hierarchical Membership-Level Contrastive Objective**: SE-HSSL proposes a new hierarchical membership-level contrastive objective that uses the cascading overlap relationships in hypergraphs to reduce sampling bias in membership-level learning and to significantly reduce the number of negative samples required. This improves both the effectiveness and the efficiency of the model.

Through experiments on 7 real-world hypergraph datasets, the paper demonstrates that SE-HSSL outperforms the state-of-the-art method in both effectiveness and efficiency.
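To make the idea of a sampling-free, CCA-based signal more concrete, here is a minimal NumPy sketch of a generic CCA-style objective between two augmented views of the same nodes. It is an illustration under assumptions, not the paper's exact loss: the function names `standardize` and `cca_style_loss`, the weight `lam`, and the particular form (an invariance term plus a decorrelation term) are hypothetical choices made for this example. The cost is linear in the number of nodes and involves no negative pairs.

```python
import numpy as np


def standardize(z, eps=1e-8):
    """Center each embedding dimension and scale it to unit norm across nodes."""
    z = z - z.mean(axis=0, keepdims=True)
    norm = np.linalg.norm(z, axis=0, keepdims=True)
    return z / (norm + eps)


def cca_style_loss(z1, z2, lam=1e-3):
    """Hypothetical CCA-style loss for two views of shape (num_nodes, dim).

    invariance:    pulls the two views of the same node together.
    decorrelation: pushes each view's feature correlation matrix toward
                   the identity, which discourages collapsed embeddings
                   without using any negative samples.
    """
    z1, z2 = standardize(z1), standardize(z2)
    dim = z1.shape[1]
    eye = np.eye(dim)
    invariance = np.sum((z1 - z2) ** 2)
    decorrelation = np.sum((z1.T @ z1 - eye) ** 2) + np.sum((z2.T @ z2 - eye) ** 2)
    return invariance + lam * decorrelation


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view_a = rng.normal(size=(100, 8))                   # embeddings from view 1
    view_b = view_a + 0.1 * rng.normal(size=(100, 8))    # slightly perturbed view 2
    print(cca_style_loss(view_a, view_b))
```

Both terms are computed from `dim x dim` correlation matrices and an element-wise difference, so the work grows as the number of nodes times `dim` squared, rather than with the number of node-hyperedge pairs.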

Hypergraph Self-supervised Learning with Sampling-efficient Signals
