Abstract: Hyperedge prediction is crucial in hypergraph analysis for understanding complex multi-entity interactions in various web-based applications, including social networks and e-commerce systems. Traditional methods often face difficulties in generating high-quality negative samples due to the imbalance between positive and negative instances. To address this, we present the Scalable and Effective Negative Sample Generation for Hyperedge Prediction (SEHP) framework, which utilizes diffusion models to tackle these challenges. SEHP employs a boundary-aware loss function that iteratively refines negative samples, moving them closer to decision boundaries to improve classification performance. SEHP samples positive instances to form sub-hypergraphs for scalable batch processing. By using structural information from sub-hypergraphs as conditions within the diffusion process, SEHP effectively captures global patterns. To enhance efficiency, our approach operates directly in latent space, avoiding the need for discrete ID generation and resulting in significant speed improvements while preserving accuracy. Extensive experiments show that SEHP outperforms existing methods in accuracy, efficiency, and scalability, representing a substantial advancement in hyperedge prediction techniques. Our code is available here.

What problem does this paper attempt to address?

The paper addresses the difficulty of generating high-quality negative samples in hypergraph analysis: positive and negative samples are severely imbalanced, and this degrades hyperedge prediction performance. Specifically:

1. **Challenges in negative sample generation**:
   - In hypergraphs, the number of potential negative samples is huge, which causes a severe imbalance between positive and negative samples.
   - Traditional negative sample generation methods rely on fixed sampling schemes and generalize poorly across datasets and application scenarios.
2. **Limitations of existing methods**:
   - Existing negative sample generation methods such as HyperSAGNN and NHP rely on rule-based or random sampling strategies and cannot effectively capture global patterns.
   - These methods are inefficient on large-scale hypergraphs and incur high computational costs.
3. **Challenges in applying diffusion models**:
   - Diffusion models are usually used to generate positive samples, whereas hyperedge prediction requires generating negative samples.
   - Diffusion models operate in continuous spaces, whereas hyperedge prediction requires discrete node IDs, so mapping continuous representations to discrete spaces is a challenge.

To solve these problems, the authors propose the Scalable and Effective Negative Sample Generation for Hyperedge Prediction (SEHP) framework. Its main contributions are:

- **Boundary-aware loss function**: Iteratively pushes negative samples towards the decision boundary, which improves their quality.
- **Conditional diffusion model**: Uses the structural information of sub-hypergraphs as a condition, so that the generated negative samples better reflect global patterns.
- **Direct operation in the latent space**: Avoids the bottleneck of discrete ID generation, significantly speeding up negative sample generation while maintaining high accuracy.

Through these improvements, SEHP achieves higher accuracy and efficiency than existing methods on multiple datasets, especially on large-scale hypergraphs.
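The idea of pushing negative samples towards the decision boundary can be illustrated with a minimal sketch. It is not the SEHP loss: it assumes a fixed linear scorer over latent hyperedge embeddings in place of the paper's classifier and diffusion model, and the names `boundary_loss`, `refine_negatives`, `margin`, and `step` are hypothetical. The loss is zero when a negative sits just on the negative side of the boundary, and gradient steps on the latent vectors move easy negatives there.

```python
import numpy as np

def boundary_loss(z, w, b, margin=0.1):
    """Squared distance of each logit from -margin (hypothetical stand-in loss).

    The loss is zero when a sample lies just on the negative side of the
    decision boundary of the linear scorer logit = z @ w + b.
    """
    logits = z @ w + b
    return 0.5 * (logits + margin) ** 2

def refine_negatives(z, w, b, margin=0.1, step=0.1, n_iters=50):
    """Run gradient descent on boundary_loss with respect to the latents z."""
    z = z.copy()  # leave the caller's array untouched
    for _ in range(n_iters):
        logits = z @ w + b                       # shape (n,)
        grad = (logits + margin)[:, None] * w    # d(loss)/dz, shape (n, d)
        z -= step * grad
    return z

rng = np.random.default_rng(0)
w = rng.normal(size=8)
w /= np.linalg.norm(w)   # unit-norm weights keep the step size stable
b = 0.0

# Assumed "easy" negatives: random latents shifted far to the negative side.
z_easy = rng.normal(size=(16, 8)) - 2.0 * w
z_hard = refine_negatives(z_easy, w, b)

print("mean loss before:", boundary_loss(z_easy, w, b).mean())
print("mean loss after: ", boundary_loss(z_hard, w, b).mean())
```

Because the refinement acts on continuous latent vectors, no mapping back to discrete node IDs is needed in this sketch, which mirrors the latent-space design choice described above.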

Scalable and Effective Negative Sample Generation for Hyperedge Prediction
