Generalized Hypergeometric Ensembles: Statistical Hypothesis Testing in Complex Networks

Giona Casiraghi, Vahan Nanumyan, Ingo Scholtes, Frank Schweitzer
DOI: https://doi.org/10.48550/arXiv.1607.02441
2016-08-08
Abstract: Statistical ensembles of networks, i.e., probability spaces of all networks that are consistent with given aggregate statistics, have become instrumental in the analysis of complex networks. Their numerical and analytical study provides the foundation for the inference of topological patterns, the definition of network-analytic measures, as well as for model selection and statistical hypothesis testing. Contributing to the foundation of these data analysis techniques, in this Letter we introduce generalized hypergeometric ensembles, a broad class of analytically tractable statistical ensembles of finite, directed and weighted networks. This framework can be interpreted as a generalization of the classical configuration model, which is commonly used to randomly generate networks with a given degree sequence or distribution. Our generalization rests on the introduction of dyadic link propensities, which capture the degree-corrected tendencies of pairs of nodes to form edges between each other. Studying empirical and synthetic data, we show that our approach provides broad perspectives for model selection and statistical hypothesis testing in data on complex networks.
Physics and Society; Social and Information Networks; Combinatorics; Data Analysis, Statistics and Probability