Fast nonparametric inference of network backbones for graph sparsification

Alec Kirkley
2024-09-10
Abstract: A network backbone provides a useful sparse representation of a weighted network by keeping only its most important links, permitting a range of computational speedups and simplifying complex network visualizations. There are many possible criteria for a link to be considered important, and hence many methods have been developed for the task of network backboning for graph sparsification. These methods can be classified as global or local in nature depending on whether they evaluate the importance of an edge in the context of the whole network or an individual node neighborhood. A key limitation of existing network backboning methods is that they either artificially restrict the topology of the backbone to take a specific form (e.g. a tree) or they require the specification of a free parameter (e.g. a significance level) that determines the number of edges to keep in the backbone. Here we develop a completely nonparametric framework for inferring the backbone of a weighted network that overcomes these limitations by automatically selecting the optimal number of edges to retain in the backbone using the Minimum Description Length (MDL) principle from information theory. We develop two encoding schemes that serve as objective functions for global and local network backbones, as well as efficient optimization algorithms to identify the optimal backbones according to these objectives with runtime complexity log-linear in the number of edges. We show that the proposed framework is generalizable to any discrete weight distribution on the edges using a maximum a posteriori (MAP) estimation procedure with an asymptotically equivalent Bayesian generative model of the backbone. We compare the proposed method with existing methods in a range of tasks on real and synthetic networks.
Social and Information Networks, Physics and Society
What problem does this paper attempt to address?
This paper addresses two key problems in network backbone extraction:

1. **Limitations of existing methods**: Existing network backboning methods either artificially restrict the topology of the backbone (for example, to a tree) or require a free parameter (such as a significance level) that determines how many edges are retained. These constraints make the methods less flexible and dependent on subjective user choices.
2. **Development of a nonparametric framework**: To overcome these limitations, the paper proposes a completely nonparametric framework that uses the Minimum Description Length (MDL) principle from information theory to automatically select the optimal number of edges to retain in the backbone of a weighted network. The method requires no user-specified parameters and adaptively determines which edges to keep.

Specifically, the paper develops two encoding schemes that serve as objective functions for global and local network backbones, along with efficient optimization algorithms that identify the optimal backbones with runtime complexity log-linear in the number of edges, which makes them suitable for large networks. In addition, using a maximum a posteriori (MAP) estimation procedure with an asymptotically equivalent Bayesian generative model of the backbone, the paper shows that the framework generalizes to any discrete weight distribution on the edges.

Overall, the main contribution is a parameter-free method for efficiently extracting backbones from weighted networks. The paper compares the method with existing backboning methods in a range of tasks on real and synthetic networks.
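The idea of letting a description length choose the number of retained edges can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the example edge list, and in particular the two-part code in `description_length`, which is a simplified stand-in and not necessarily the encoding used in the paper. The sketch assumes positive integer weights, ranks edges by weight across the whole network (a global criterion), and scans every candidate backbone size with a running sum, so the cost is dominated by the sort.

```python
from math import lgamma, log, inf


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), for 0 <= k <= n."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_compositions(total, parts):
    """Log of the number of ways to write `total` as `parts` positive integers."""
    if parts == 0:
        return 0.0 if total == 0 else inf
    if total < parts:
        return inf  # impossible split, so infinite cost
    return log_binom(total - 1, parts - 1)


def description_length(E, W, k, Wb):
    """Illustrative code length (in nats) for a backbone of k edges with weight Wb.

    Assumed encoding, not taken from the paper: send k and Wb, then which
    k of the E edges are kept, then the weights inside and outside the backbone.
    """
    return (
        log(E + 1) + log(W + 1)              # cost of k and Wb
        + log_binom(E, k)                    # which edges are in the backbone
        + log_compositions(Wb, k)            # weights of backbone edges
        + log_compositions(W - Wb, E - k)    # weights of remaining edges
    )


def mdl_global_backbone(edges):
    """Return the top-k edges by weight, with k minimizing description_length."""
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    E = len(ranked)
    W = sum(w for _, _, w in ranked)
    best_k, best_len = 0, description_length(E, W, 0, 0)
    Wb = 0
    for k, (_, _, w) in enumerate(ranked, start=1):
        Wb += w  # running total weight of the k heaviest edges
        length = description_length(E, W, k, Wb)
        if length < best_len:
            best_k, best_len = k, length
    return ranked[:best_k]


# Hypothetical toy network: three heavy edges and seven light ones.
edges = [
    ("a", "b", 50), ("a", "c", 45), ("b", "c", 40),
    ("c", "d", 1), ("d", "e", 1), ("e", "f", 2), ("f", "g", 1),
    ("g", "h", 1), ("h", "i", 1), ("i", "j", 1),
]
backbone = mdl_global_backbone(edges)
```

On this toy input the objective is smallest when the three heavy edges are kept, so no threshold or significance level has to be supplied. A local variant would apply a comparable objective within each node's neighborhood rather than across the whole edge list.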