Locally Estimating Core Numbers

Michael P. O'Brien, Blair D. Sullivan
DOI: https://doi.org/10.48550/arXiv.1410.6793
2014-11-06
Abstract: Graphs are a powerful way to model interactions and relationships in data from a wide variety of application domains. In this setting, entities represented by vertices at the "center" of the graph are often more important than those associated with vertices on the "fringes". For example, central nodes tend to be more critical in the spread of information or disease and play an important role in clustering/community formation. Identifying such "core" vertices has recently received additional attention in the context of network experiments, which analyze the response when a random subset of vertices is exposed to a treatment (e.g., inoculation, free product samples, etc.). Specifically, the likelihood of having many central vertices in any exposure subset can have a significant impact on the experiment. We focus on using $k$-cores and core numbers to measure the extent to which a vertex is central in a graph. Existing algorithms for computing the core number of a vertex require the entire graph as input, an unrealistic scenario in many real-world applications. Moreover, in the context of network experiments, the subgraph induced by the treated vertices is only known in a probabilistic sense. We introduce a new method for estimating the core number based only on the properties of the graph within a region of radius $\delta$ around the vertex, and prove an asymptotic error bound for our estimator on random graphs. Further, we empirically validate the accuracy of our estimator for small values of $\delta$ on a representative corpus of real data sets. Finally, we evaluate the impact of improved local estimation on an open problem in network experimentation posed by Ugander et al.
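To make the notions of core number and a radius-$\delta$ region concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the estimator from the paper: the abstract does not spell out the method, so the sketch uses the simplest local baseline, namely the core number of the vertex inside the subgraph induced by its radius-$\delta$ ball. That value can never exceed the true core number. The function names (`core_numbers`, `ball`, `local_core_estimate`) and the adjacency-dictionary representation are hypothetical choices made for this example.

```python
from collections import deque


def core_numbers(adj):
    """Exact core numbers by repeatedly peeling a minimum-degree vertex.

    adj maps each vertex to an iterable of its neighbours (undirected graph).
    This simple version is quadratic; bucket-based variants run in linear time.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=deg.__getitem__)
        k = max(k, deg[v])  # core numbers never decrease along the peeling order
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core


def ball(adj, source, delta):
    """Vertices at distance at most delta from source, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == delta:
            continue  # do not expand past the radius
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return set(dist)


def local_core_estimate(adj, v, delta):
    """Naive local baseline (an assumption, not the paper's estimator):
    core number of v in the subgraph induced by its radius-delta ball."""
    region = ball(adj, v, delta)
    sub = {u: [w for w in adj[u] if w in region] for u in region}
    return core_numbers(sub)[v]
```

Because any $k$-core of a subgraph is contained in the $k$-core of the full graph, this baseline is a lower bound that is non-decreasing in $\delta$ and becomes exact once the ball covers the vertex's whole connected component.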
Social and Information Networks