Consistent Recovery Threshold of Hidden Nearest Neighbor Graphs

Jian Ding, Yihong Wu, Jiaming Xu, Dana Yang
DOI: https://doi.org/10.1109/tit.2021.3085773
IF: 2.5
2021-01-01
IEEE Transactions on Information Theory
Abstract: Motivated by applications such as discovering strong ties in social networks and assembling genome subsequences in biology, we study the problem of recovering a hidden 2k-nearest neighbor (NN) graph in an n-vertex complete graph, whose edge weights are independent and distributed according to $P_n$ for edges in the hidden 2k-NN graph and $Q_n$ otherwise. The special case of Bernoulli distributions corresponds to a variant of the Watts-Strogatz small-world graph. We focus on two types of asymptotic recovery guarantees as $n \to \infty$: (1) exact recovery: all edges are classified correctly with probability tending to one; (2) almost exact recovery: the expected number of misclassified edges is $o(nk)$. We show that the maximum likelihood estimator achieves (1) exact recovery for $2 \le k \le n^{o(1)}$ if $\liminf 2\alpha_n / \log n > 1$; (2) almost exact recovery for $1 \le k \le o(\log n / \log\log n)$ if $\liminf k D(P_n \| Q_n) / \log n > 1$, where $\alpha_n \triangleq -2 \log \int \sqrt{dP_n \, dQ_n}$ is the Rényi divergence of order 1/2 and $D(P_n \| Q_n)$ is the Kullback-Leibler divergence. Under mild distributional assumptions, these conditions are shown to be information-theoretically necessary for any algorithm to succeed. A key challenge in the analysis is the enumeration of 2k-NN graphs that differ from the hidden one by a given number of edges. We also analyze several computationally efficient algorithms and provide sufficient conditions under which they achieve exact/almost exact recovery. In particular, we develop a polynomial-time algorithm that attains the threshold for exact recovery under the small-world model.
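As an illustration only, the two quantities in the abstract's thresholds can be evaluated numerically in the Bernoulli (small-world) special case. The sketch below assumes $P_n$ = Bernoulli(p) and $Q_n$ = Bernoulli(q); the function names and the finite-n "ratio" helpers are hypothetical and not taken from the paper. The stated conditions are asymptotic (a lim inf exceeding 1), so a ratio computed at a single n is only a rough heuristic.

```python
import math


def renyi_half_bernoulli(p: float, q: float) -> float:
    """Renyi divergence of order 1/2 between Bernoulli(p) and Bernoulli(q).

    Discrete form of alpha = -2 * log( integral of sqrt(dP dQ) ), in nats.
    """
    bhattacharyya = math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
    return -2.0 * math.log(bhattacharyya)


def kl_bernoulli(p: float, q: float) -> float:
    """Kullback-Leibler divergence D(Bernoulli(p) || Bernoulli(q)) in nats.

    Assumes 0 < q < 1; terms with zero probability under P contribute 0.
    """
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a > 0.0:
            total += a * math.log(a / b)
    return total


def exact_recovery_ratio(n: int, p: float, q: float) -> float:
    """Finite-n value of 2 * alpha_n / log n (hypothetical helper)."""
    return 2.0 * renyi_half_bernoulli(p, q) / math.log(n)


def almost_exact_recovery_ratio(n: int, k: int, p: float, q: float) -> float:
    """Finite-n value of k * D(P_n || Q_n) / log n (hypothetical helper)."""
    return k * kl_bernoulli(p, q) / math.log(n)


if __name__ == "__main__":
    # Example parameters chosen arbitrarily for illustration.
    n, k, p, q = 10_000, 3, 0.9, 0.01
    print("2*alpha/log n :", exact_recovery_ratio(n, p, q))
    print("k*D/log n     :", almost_exact_recovery_ratio(n, k, p, q))
```

Ratios well above 1 suggest a parameter regime on the recoverable side of the corresponding threshold, and ratios well below 1 the opposite, subject to the range restrictions on k given in the abstract.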