Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection
Talya Eden,Dana Ron,C. Seshadhri
DOI: https://doi.org/10.48550/arXiv.1604.03661
2017-02-16
Abstract:We revisit the classic problem of estimating the degree distribution moments of an undirected graph. Consider an undirected graph $G=(V,E)$ with $n$ vertices, and define (for $s > 0$) $\mu_s = \frac{1}{n}\cdot\sum_{v \in V} d^s_v$. Our aim is to estimate $\mu_s$ within a multiplicative error of $(1+\epsilon)$ (for a given approximation parameter $\epsilon>0$) in sublinear time. We consider the sparse graph model that allows access to: uniform random vertices, queries for the degree of any vertex, and queries for a neighbor of any vertex. For the case of $s=1$ (the average degree), $\widetilde{O}(\sqrt{n})$ queries suffice for any constant $\epsilon$ (Feige, SICOMP 06 and Goldreich-Ron, RSA 08). Gonen-Ron-Shavitt (SIDMA 11) extended this result to all integral $s > 0$, by designing an algorithms that performs $\widetilde{O}(n^{1-1/(s+1)})$ queries.
We design a new, significantly simpler algorithm for this problem. In the worst-case, it exactly matches the bounds of Gonen-Ron-Shavitt, and has a much simpler proof. More importantly, the running time of this algorithm is connected to the degeneracy of $G$. This is (essentially) the maximum density of an induced subgraph. For the family of graphs with degeneracy at most $\alpha$, it has a query complexity of $\widetilde{O}\left(\frac{n^{1-1/s}}{\mu^{1/s}_s} \Big(\alpha^{1/s} + \min\{\alpha,\mu^{1/s}_s\}\Big)\right) = \widetilde{O}(n^{1-1/s}\alpha/\mu^{1/s}_s)$. Thus, for the class of bounded degeneracy graphs (which includes all minor closed families and preferential attachment graphs), we can estimate the average degree in $\widetilde{O}(1)$ queries, and can estimate the variance of the degree distribution in $\widetilde{O}(\sqrt{n})$ queries. This is a major improvement over the previous worst-case bounds. Our key insight is in designing an estimator for $\mu_s$ that has low variance when $G$ does not have large dense subgraphs.
Data Structures and Algorithms,Discrete Mathematics