Side Information in the Binary Stochastic Block Model: Exact Recovery

Hussein Saad, Ahmed Abotabl, Aria Nosratinia
DOI: https://doi.org/10.48550/arXiv.1708.04972
2017-08-16
Information Theory
Abstract: In the community detection problem, one may have access to additional observations (side information) about the label of each node. This paper studies the effect of the quality and quantity of side information on the phase transition of exact recovery in the binary symmetric stochastic block model (SBM) with $n$ nodes. When the side information consists of the label observed through a binary symmetric channel with crossover probability $\alpha$, and when $\log(\frac{1-\alpha}{\alpha}) = O(\log(n))$, it is shown that side information has a positive effect on the phase transition; the new phase transition under this condition is characterized. When $\alpha$ is constant or approaches zero sufficiently slowly, i.e., $\log(\frac{1-\alpha}{\alpha}) = o(\log(n))$, it is shown that side information does not help exact recovery. When the side information consists of the label observed through a binary erasure channel with parameter $\epsilon$, and when $\log(\epsilon) = O(\log(n))$, it is shown that side information improves exact recovery, and the new phase transition is characterized. If $\log(\epsilon) = o(\log(n))$, it is shown that side information is not helpful. The results are then generalized to arbitrary side information of finite cardinality. Necessary and sufficient conditions for exact recovery are derived; they are tight except for one special case under $M$-ary side information. An efficient algorithm that incorporates side information is proposed; it combines a partial recovery algorithm with a local improvement procedure. Sufficient conditions are derived for exact recovery under this efficient algorithm.
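To make the observation model in the abstract concrete, the following is a minimal sketch of how a binary symmetric SBM and the two kinds of side information could be simulated. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the uniform i.i.d. label prior, and the generic edge probabilities `p` (within community) and `q` (across communities) are all hypothetical choices made here, since the abstract does not specify them.

```python
import random


def sample_labels(n, rng):
    """Draw n labels in {+1, -1}, i.i.d. uniform (an assumption of this sketch)."""
    return [rng.choice((1, -1)) for _ in range(n)]


def sample_sbm_edges(labels, p, q, rng):
    """Binary symmetric SBM: same-community pairs connect with probability p,
    cross-community pairs with probability q (p, q are hypothetical parameters)."""
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                edges.append((i, j))
    return edges


def bsc_side_info(labels, alpha, rng):
    """Binary symmetric channel: flip each label independently with
    crossover probability alpha."""
    return [-x if rng.random() < alpha else x for x in labels]


def bec_side_info(labels, epsilon, rng):
    """Binary erasure channel: erase each label (None) independently with
    probability epsilon, otherwise reveal it exactly."""
    return [None if rng.random() < epsilon else x for x in labels]


if __name__ == "__main__":
    rng = random.Random(0)
    labels = sample_labels(200, rng)
    edges = sample_sbm_edges(labels, p=0.2, q=0.05, rng=rng)
    noisy = bsc_side_info(labels, alpha=0.1, rng=rng)
    partial = bec_side_info(labels, epsilon=0.7, rng=rng)
    flipped = sum(a != b for a, b in zip(labels, noisy))
    erased = sum(y is None for y in partial)
    print(len(edges), flipped, erased)
```

In this sketch, smaller `alpha` or `epsilon` corresponds to higher-quality side information. The abstract's conditions concern how fast $\log(\frac{1-\alpha}{\alpha})$ or $\log(\epsilon)$ grows relative to $\log(n)$, which a fixed-parameter simulation like this one does not capture on its own.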