Sampling Arbitrary Subgraphs Exactly Uniformly in Sublinear Time
Hendrik Fichtenberger, Mingze Gao, Pan Peng
DOI: https://doi.org/10.48550/arXiv.2005.01861
2020-05-04
Data Structures and Algorithms
Abstract: We present a simple sublinear-time algorithm for sampling an arbitrary subgraph $H$ \emph{exactly uniformly} from a graph $G$ with $m$ edges, to which the algorithm has access by performing the following types of queries: (1) degree queries, (2) neighbor queries, (3) pair queries and (4) edge sampling queries. The query complexity and running time of our algorithm are $\tilde{O}(\min\{m, \frac{m^{\rho(H)}}{\# H}\})$ and $\tilde{O}(\frac{m^{\rho(H)}}{\# H})$, respectively, where $\rho(H)$ is the fractional edge-cover number of $H$ and $\# H$ is the number of copies of $H$ in $G$. For any clique on $r$ vertices, i.e., $H=K_r$, our algorithm is almost optimal, as any algorithm that samples a copy of $H$ from any distribution that has $\Omega(1)$ total probability mass on the set of all copies of $H$ must perform $\Omega(\min\{m, \frac{m^{\rho(H)}}{\# H\cdot (cr)^r}\})$ queries. Together with the query and time complexities of the $(1\pm \varepsilon)$-approximation algorithm for the number of subgraphs $H$ by Assadi, Kapralov and Khanna [ITCS 2018] and the lower bound by Eden and Rosenbaum [APPROX 2018] for approximately counting cliques, our results suggest that in our query model, approximately counting cliques is "equivalent to" exactly uniformly sampling cliques, in the sense that the query and time complexities of exactly uniform sampling and randomized approximate counting are within a polylogarithmic factor of each other. This stands in interesting contrast to an analogous relation between approximate counting and almost uniform sampling for self-reducible problems in the polynomial-time regime by Jerrum, Valiant and Vazirani [TCS 1986].
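To make the query model concrete, the following is a minimal Python sketch for the special case $H=K_3$ (triangles), where $\rho(K_3)=3/2$. It is an illustration under stated assumptions, not the authors' algorithm verbatim: the class `GraphOracle`, the functions `try_sample_triangle` and `sample_triangle`, and the assumption that $m$ is known to the sampler are all hypothetical choices made here. The sketch uses rejection sampling: each attempt outputs any fixed triangle with probability exactly $\frac{1}{m\sqrt{2m}}$, so an accepted triangle is exactly uniform, and the expected number of attempts is $\sqrt{2}\,m^{3/2}/\# H$.

```python
import math
import random


class GraphOracle:
    """Hypothetical oracle exposing the four query types on a simple graph."""

    def __init__(self, edges, rng=None):
        self.rng = rng or random.Random()
        self.edges = sorted({tuple(sorted(e)) for e in edges})
        self.edge_set = set(self.edges)
        self.adj = {}
        for u, v in self.edges:
            self.adj.setdefault(u, []).append(v)
            self.adj.setdefault(v, []).append(u)
        self.queries = 0  # total number of queries issued so far

    def degree(self, v):        # (1) degree query
        self.queries += 1
        return len(self.adj[v])

    def neighbor(self, v, i):   # (2) neighbor query: i-th neighbor of v
        self.queries += 1
        return self.adj[v][i]

    def pair(self, u, v):       # (3) pair query: is {u, v} an edge?
        self.queries += 1
        return tuple(sorted((u, v))) in self.edge_set

    def sample_edge(self):      # (4) edge sampling query: uniform edge
        self.queries += 1
        return self.rng.choice(self.edges)


def try_sample_triangle(g, m):
    """One attempt; returns a triangle (a, b, w) or None on rejection.

    Assumption: m (the number of edges) is known to the sampler.
    """
    threshold = math.sqrt(2 * m)

    def key(x):
        # Total order on vertices by (degree, id).
        return (g.degree(x), x)

    # The sampled edge must be the two smallest vertices of the triangle.
    a, b = sorted(g.sample_edge(), key=key)
    da = g.degree(a)
    if da <= threshold:
        # Low degree: each neighbor of a is picked with probability 1/threshold.
        x = g.rng.random() * threshold
        if x >= da:
            return None
        w = g.neighbor(a, int(x))
    else:
        # High degree: pick w proportionally to its degree, then thin it so
        # that each high-degree vertex is kept with probability 1/threshold.
        w = g.rng.choice(g.sample_edge())
        dw = g.degree(w)
        if dw <= threshold or g.rng.random() >= threshold / dw:
            return None
    if key(w) <= key(b):
        return None  # w must be the largest vertex, so each triangle counts once
    if g.pair(a, w) and g.pair(b, w):
        return (a, b, w)
    return None


def sample_triangle(g, m, max_attempts=100_000):
    """Repeat attempts until one succeeds; None if all attempts are rejected."""
    for _ in range(max_attempts):
        t = try_sample_triangle(g, m)
        if t is not None:
            return tuple(sorted(t))
    return None
```

For example, `g = GraphOracle([(0, 1), (0, 2), (1, 2), (2, 3)])` followed by `sample_triangle(g, len(g.edges))` returns the only triangle in that graph, and `g.queries` reports how many queries were spent. Requiring the sampled edge to join the two smallest vertices of the triangle in the (degree, id) order is one way to ensure every triangle has exactly one accepting outcome; other canonical orderings would work as well.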