Community Detection in Hypergraphs via Mutual Information Maximization

Jurgen Kritschgau, Daniel Kaiser, Oliver Alvarado Rodriguez, Ilya Amburg, Jessalyn Bolkema, Thomas Grubb, Fangfei Lan, Sepideh Maleki, Phil Chodrow, Bill Kay
2023-08-09
Abstract: The hypergraph community detection problem seeks to identify groups of related nodes in hypergraph data. We propose an information-theoretic hypergraph community detection algorithm which compresses the observed data in terms of community labels and community-edge intersections. This algorithm can also be viewed as maximum-likelihood inference in a degree-corrected microcanonical stochastic blockmodel. We perform the inference/compression step via simulated annealing. Unlike several recent algorithms based on canonical models, our microcanonical algorithm does not require inference of statistical parameters such as node degrees or pairwise group connection rates. Through synthetic experiments, we find that our algorithm succeeds down to recently-conjectured thresholds for sparse random hypergraphs. We also find competitive performance in cluster recovery tasks on several hypergraph data sets.
Discrete Mathematics, Social and Information Networks, Combinatorics, Optimization and Control
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses community detection in hypergraphs. It proposes an information-theoretic approach to identifying groups of related nodes in hypergraph data. The main points are as follows:

1. **Hypergraph Community Detection**:
   - Identifying groups of related nodes in hypergraph data.
2. **Application of Information-Theoretic Methods**:
   - Compressing the observed data in terms of community labels and community-edge intersections.
3. **Statistical Inference**:
   - The algorithm can be viewed as maximum-likelihood inference in a degree-corrected microcanonical stochastic blockmodel. Unlike several recent algorithms based on canonical models, it does not require inferring statistical parameters such as node degrees or pairwise group connection rates.
4. **Application of Simulated Annealing**:
   - Using simulated annealing to carry out the inference/compression step.
5. **Performance Evaluation on Sparse Random Hypergraphs**:
   - Testing the algorithm on sparse random hypergraphs through synthetic experiments, where it succeeds down to recently conjectured thresholds.
6. **Performance Comparison on Real Datasets**:
   - Running cluster recovery tasks on several hypergraph datasets, where the algorithm performs competitively.

Overall, the paper proposes an information-theoretic approach to hypergraph community detection, carries out the optimization with simulated annealing, and reports success down to conjectured thresholds on sparse random hypergraphs as well as competitive cluster recovery on real datasets.
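
To make the simulated annealing step concrete, here is a minimal sketch of annealing over community labels on a toy hypergraph. It is not the paper's algorithm: the objective `cut_cost` is a stand-in (the number of hyperedges spanning more than one community) rather than the description-length objective the paper compresses, and the label-swap move, the geometric cooling schedule, and the names `cut_cost` and `anneal_partition` are all assumptions made for illustration. Swap moves keep group sizes fixed, which stops this stand-in objective from collapsing every node into one community.

```python
import math
import random


def cut_cost(edges, labels):
    """Stand-in objective: count hyperedges whose nodes span more than one community."""
    return sum(1 for edge in edges if len({labels[v] for v in edge}) > 1)


def anneal_partition(edges, n_nodes, n_groups, steps=20000,
                     t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over balanced label assignments (illustrative only)."""
    rng = random.Random(seed)
    # Balanced random initial labels; the swap moves below keep group sizes fixed.
    labels = [i % n_groups for i in range(n_nodes)]
    rng.shuffle(labels)
    cost = cut_cost(edges, labels)
    best_labels, best_cost = labels[:], cost

    for step in range(steps):
        # Geometric cooling schedule from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        u, v = rng.sample(range(n_nodes), 2)
        if labels[u] == labels[v]:
            continue  # swapping equal labels changes nothing
        labels[u], labels[v] = labels[v], labels[u]
        new_cost = cut_cost(edges, labels)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse states.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost
            if cost < best_cost:
                best_labels, best_cost = labels[:], cost
        else:
            labels[u], labels[v] = labels[v], labels[u]  # undo the rejected swap
    return best_labels, best_cost


# Toy hypergraph: two groups of four nodes joined by one bridging edge (3, 4).
edges = [(0, 1, 2), (1, 2, 3), (0, 2, 3), (0, 1, 3),
         (4, 5, 6), (5, 6, 7), (4, 6, 7), (4, 5, 7), (3, 4)]
best_labels, best_cost = anneal_partition(edges, n_nodes=8, n_groups=2)
print(best_labels, best_cost)
```

On this toy input, the only hyperedge left cut in the best balanced partition is the bridge `(3, 4)`. Replacing `cut_cost` with a description-length objective would bring the sketch closer in spirit to the approach summarized above, but the form of that objective is not given here.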