Pruning Edges and Gradients to Learn Hypergraphs from Larger Sets

David W. Zhang, Gertjan J. Burghouts, Cees G. M. Snoek
DOI: https://doi.org/10.48550/arXiv.2106.13919
2023-01-17
Abstract: This paper aims for set-to-hypergraph prediction, where the goal is to infer the set of relations for a given set of entities. This is a common abstraction for applications in particle physics, biological systems, and combinatorial optimization. We address two common scaling problems encountered in set-to-hypergraph tasks that limit the size of the input set: the exponentially growing number of hyperedges and the run-time complexity, both leading to higher memory requirements. We make three contributions. First, we propose to predict and supervise the *positive* edges only, which changes the asymptotic memory scaling from exponential to linear. Second, we introduce a training method that encourages iterative refinement of the predicted hypergraph, which allows us to skip iterations in the backward pass for improved efficiency and constant memory usage. Third, we combine both contributions in a single set-to-hypergraph model that enables us to address problems with larger input set sizes. We provide ablations for our main technical contributions and show that our model outperforms prior state-of-the-art, especially for larger sets.
Machine Learning
What problem does this paper attempt to address?
This paper addresses two scaling problems in the set-to-hypergraph prediction task: **memory scalability and computational complexity**. Specifically:

1. **Memory Scalability Problem**: As the input set grows, the number of possible hyperedges grows exponentially, and the memory requirement grows with it. This makes larger input sets infeasible to process.
2. **Computational Complexity Problem**: Common combinatorial optimization problems have super-linear time complexity. For example, finding a convex hull in d-dimensional space takes \(O(n \log(n) + n^{\left\lfloor \frac{d}{2} \right\rfloor})\) time, so larger input sets require more computation.

To solve these problems, the paper makes the following three main contributions:

### 1. **Predict Only Positive Edges**

By predicting and supervising only the edges that actually exist (the positive edges), the asymptotic memory requirement drops from exponential \(O(2^n)\) to linear \(O(mn)\), where \(n\) is the size of the input set and \(m\) is the number of existing edges. This significantly reduces memory use for sparse hypergraphs.

### 2. **Iterative Refinement Training Method**

The paper introduces a training method that encourages iterative refinement of the predicted hypergraph. Because each iteration is trained to improve on the previous prediction, some iterations can be skipped in the backward pass, which gives constant memory usage and higher efficiency. This addresses the additional computation that harder problems require.

### 3. **A Single Model Combining the First Two Contributions**

The two methods are combined in a single model that can handle larger input sets. The model accepts input sets of different sizes with different numbers of edges, while respecting permutation symmetry over both nodes and edges.
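The memory argument behind the first contribution can be made concrete with a small sketch. This is an illustration under simple assumptions, not the authors' code: the helper names `positive_incidence` and `num_candidate_edges` are hypothetical, hyperedges are given as Python sets of node indices, and every node subset is counted as a candidate edge.

```python
def positive_incidence(num_nodes, edges):
    """Build an m x n 0/1 incidence matrix with one row per existing (positive) hyperedge."""
    return [[1 if v in edge else 0 for v in range(num_nodes)] for edge in edges]


def num_candidate_edges(num_nodes):
    """Number of node subsets a dense formulation would have to consider (assumed: all subsets)."""
    return 2 ** num_nodes


# Toy hypergraph on 4 nodes with m = 2 positive hyperedges (illustrative data only).
num_nodes = 4
edges = [{0, 1, 2}, {2, 3}]
incidence = positive_incidence(num_nodes, edges)

# Storage for positive edges only is m * n entries; the candidate count grows as 2^n.
positive_entries = len(incidence) * num_nodes
dense_candidates = num_candidate_edges(num_nodes)
```

For this toy input the positive-edge representation stores 2 rows of 4 entries, while the number of candidate subsets is already 16 and doubles with every added node.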
Through these improvements, the proposed method outperforms existing state-of-the-art methods in experiments, especially on larger input sets.

### Summary

The main goal of the paper is to enable set-to-hypergraph prediction on larger input sets, and to perform well on multiple benchmarks, by reducing memory usage and improving computational efficiency.
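As an illustration of the second contribution, the idea of skipping iterations in the backward pass can be sketched on a toy problem. Everything here is an assumption for illustration and not the paper's model: the refinement step is a scalar update that moves a prediction `h` toward an input `x` by a learned fraction `w`, the loss is a squared error against a target `y`, and the gradient is derived by hand rather than with an autodiff library. Only the last `num_grad_steps` iterations store their inputs, so the stored state does not grow with the total number of iterations.

```python
def refine(h, x, w):
    """One toy refinement step: move the prediction h toward x by a fraction w."""
    return h + w * (x - h)


def truncated_grad(x, y, w, num_steps, num_grad_steps, h0=0.0):
    """Return (loss, dloss/dw, stored states), backpropagating through the last steps only."""
    h = h0
    # Early iterations: forward only, nothing is stored for the backward pass.
    for _ in range(num_steps - num_grad_steps):
        h = refine(h, x, w)
    # Final iterations: keep each step's input so the step can be differentiated.
    saved = []
    for _ in range(num_grad_steps):
        saved.append(h)
        h = refine(h, x, w)
    loss = (h - y) ** 2
    # Manual backward pass: g holds dloss/dh at the output of the current step.
    g = 2.0 * (h - y)
    grad_w = 0.0
    for h_prev in reversed(saved):
        grad_w += g * (x - h_prev)  # direct dependence of this step on w
        g *= 1.0 - w                # propagate to the previous step's output
    return loss, grad_w, len(saved)


full = truncated_grad(x=1.0, y=0.0, w=0.5, num_steps=3, num_grad_steps=3)
short = truncated_grad(x=1.0, y=0.0, w=0.5, num_steps=3, num_grad_steps=1)
```

Both calls compute the same forward result and loss. The truncated call treats the prediction entering the last step as a constant, so it yields an approximate gradient while storing a single intermediate state.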