Sensitivity and Reliability in Incomplete Networks: Centrality Metrics to Community Scoring Functions

Soumya Sarkar, Sanjukta Bhowmick, Suhansanu Kumar, Animesh Mukherjee
DOI: https://doi.org/10.48550/arXiv.1608.05295
2016-08-24
Abstract: Network analysis is an important tool in understanding the behavior of complex systems of interacting entities. However, due to the limitations of data gathering technologies, some interactions might be missing from the network model. This is a ubiquitous problem in all domains that use network analysis, from social networks to hyper-linked web networks to biological networks. Consequently, an important question in analyzing networks is to understand how increasing the noise level (i.e., the percentage of missing edges) affects different network parameters. In this paper we evaluate the effect of noise on community scoring and centrality-based parameters with respect to two different aspects of network analysis: (i) sensitivity, that is, how the parameter value changes as edges are removed, and (ii) reliability in the context of message spreading, that is, how the time taken to broadcast a message changes as edges are removed. Our experiments on synthetic and real-world networks and three different noise models demonstrate that for both aspects, over all networks and all noise models, permanence qualifies as the most effective metric. For the sensitivity experiments, closeness centrality is a close second. For the message spreading experiments, closeness and betweenness centrality based initiator selection closely compete with permanence. This is because permanence has a dual characteristic: the cumulative permanence over all vertices is sensitive to noise, but the ids of the top-ranked vertices, which are used to find seeds during message spreading, remain relatively stable under noise.
Social and Information Networks, Physics and Society
What problem does this paper attempt to address?
This paper examines two aspects of analyzing incomplete networks, that is, networks from which some edges are missing: sensitivity and reliability.

1. **Sensitivity**: How the value of a network parameter changes as edges are deleted (i.e., as noise is introduced). A sensitive parameter is a good indicator of structural change, and the change in its value should be proportional to the noise level. When only a few edges are deleted, the value should change little; when many edges are deleted, it should change significantly.
2. **Reliability**: Whether an operation such as message propagation can still be carried out efficiently when the network is incomplete. A parameter is reliable if, even in the presence of noise, using its high-value nodes as seeds for message propagation does not significantly increase the propagation time.

The paper experimentally evaluates different network parameters (centrality measures and community scoring functions) on both aspects, using synthetic and real-world networks and three noise models. It identifies permanence as the most effective metric across all networks and noise models tested, combining high sensitivity with good reliability.
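
The two measurements can be illustrated with a minimal Python sketch. It is not the authors' code. It makes several assumptions that are not taken from the paper: noise is uniform random edge removal (only one possible noise model), closeness centrality stands in for the parameter under study (permanence is not implemented here), seeds are chosen on the noisy graph, and a message spreads from every informed vertex to all of its neighbours in each round. All function names are hypothetical.

```python
import random
from collections import deque


def build_graph(edges):
    """Return an adjacency dict for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, sources):
    """Hop distance from the nearest source to every reachable vertex."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):  # vertices isolated by noise have no entry
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, nodes):
    """Closeness centrality, scaled by the fraction of vertices reached."""
    n = len(nodes)
    scores = {}
    for u in nodes:
        dist = bfs_distances(adj, [u])
        total = sum(dist.values())
        reach = len(dist) - 1
        scores[u] = (reach / total) * (reach / (n - 1)) if total > 0 else 0.0
    return scores


def remove_edges(edges, fraction, rng):
    """Assumed noise model: drop a fraction of edges uniformly at random."""
    keep = len(edges) - int(round(fraction * len(edges)))
    return rng.sample(edges, keep)


def sensitivity_curve(edges, nodes, fractions, seed=0):
    """Mean closeness at each noise level, as (fraction, mean) pairs."""
    rng = random.Random(seed)
    curve = []
    for f in fractions:
        adj = build_graph(remove_edges(edges, f, rng))
        scores = closeness(adj, nodes)
        curve.append((f, sum(scores.values()) / len(nodes)))
    return curve


def top_k(scores, k):
    """The k highest-scoring vertices, ties broken by vertex id."""
    return sorted(scores, key=lambda u: (-scores[u], u))[:k]


def broadcast_time(adj, seeds):
    """Rounds until every vertex reachable from the seeds is informed."""
    return max(bfs_distances(adj, seeds).values())


if __name__ == "__main__":
    # Toy network: a ring of 20 vertices, each linked to its next 2 neighbours.
    nodes = list(range(20))
    edges = [(i, (i + s) % 20) for i in nodes for s in (1, 2)]

    # Sensitivity: how the aggregate parameter value moves with noise.
    for fraction, mean_score in sensitivity_curve(edges, nodes, [0.0, 0.1, 0.3]):
        print(f"noise={fraction:.1f}  mean closeness={mean_score:.3f}")

    # Reliability: broadcast time from top-ranked seeds on a noisy graph.
    noisy = build_graph(remove_edges(edges, 0.3, random.Random(1)))
    seeds = top_k(closeness(noisy, nodes), 2)
    print("seeds:", seeds, "rounds:", broadcast_time(noisy, seeds))
```

Under these assumptions, a sensitive parameter would show a mean value that shifts steadily as the noise fraction grows, while a reliable one would yield seeds whose broadcast time stays close to the noise-free value. Note that `broadcast_time` counts only vertices reachable from the seeds, so a run on a fragmented graph should be read together with how many vertices were actually reached.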