Robust recovery for stochastic block models, simplified and generalized

Sidhanth Mohanty, Prasad Raghavendra, David X. Wu
2024-02-22
Abstract: We study the problem of $\textit{robust community recovery}$: efficiently recovering communities in sparse stochastic block models in the presence of adversarial corruptions. In the absence of adversarial corruptions, there are efficient algorithms when the $\textit{signal-to-noise ratio}$ exceeds the $\textit{Kesten--Stigum (KS) threshold}$, widely believed to be the computational threshold for this problem. The question we study is: does the computational threshold for robust community recovery also lie at the KS threshold? We answer this question affirmatively, providing an algorithm for robust community recovery for arbitrary stochastic block models on any constant number of communities, generalizing the work of Ding, d'Orsi, Nasser & Steurer on an efficient algorithm above the KS threshold in the case of $2$-community block models.
Data Structures and Algorithms, Probability
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper studies **robust community recovery**: efficiently recovering community structure from sparse Stochastic Block Models (SBMs) when an adversary has corrupted the graph. Without adversarial corruptions, efficient algorithms are known to recover communities whenever the signal-to-noise ratio exceeds the Kesten–Stigum (KS) threshold, which is widely believed to be the computational threshold for this problem. The core question of the paper is: does the computational threshold for robust community recovery also lie at the KS threshold? The authors answer affirmatively and provide a robust community recovery algorithm for arbitrary SBMs on any constant number of communities, generalizing the efficient algorithm of Ding, d'Orsi, Nasser & Steurer [DdNS22] for the 2-community block model.

### Main Contributions

1. **Anomalous Eigenvectors of the Bethe Hessian Matrix**: The authors confirm that the Bethe Hessian matrix has anomalous eigenvectors related to the community structure near the KS threshold, and they validate this by explicitly constructing these eigenvectors.
2. **Robust PCA Algorithm for Sparse Matrices**: The authors develop a robust Principal Component Analysis (PCA) algorithm for sparse matrices that can partially recover the top eigenspace under adversarial perturbations.
3. **Rounding Algorithm for Community Assignment**: The authors propose a rounding algorithm that converts vector assignments on vertices into community assignments, inspired by the algorithm of Charikar & Wirth [CW04] for the 2XOR problem.

### Method Overview

1. **Preprocessing**: Preprocess the perturbed graph by truncating high-degree nodes to remove localized perturbations.
2. **Constructing a Graph-Aware Symmetric Matrix**: Construct a graph-aware symmetric matrix whose negative eigenvalues carry the true community information of the unperturbed graph.
3. **Trimming and Spectral Algorithm**: Recursively trim rows and columns of the matrix to remove small negative eigenvalues, then use a spectral algorithm to robustly recover the subspace containing community information.
4. **Rounding to Communities**: Convert the recovered subspace into community assignments that have constant correlation with the true communities.

### Conclusion

The main contribution of this paper is a robust community recovery algorithm that efficiently recovers community structure in the presence of adversarial perturbations whenever the signal-to-noise ratio exceeds the KS threshold. The result extends existing algorithms for the 2-community block model and provides theoretical support for more general SBMs.
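The first two steps of the method overview (degree truncation and building a graph-aware symmetric matrix) can be illustrated with a small, non-robust sketch. Everything here is an assumption made for illustration: the function names, the parameters `a`, `b`, and `max_degree`, and the use of the standard Bethe Hessian form $H(r) = (r^2 - 1)I - rA + D$ with $r$ set to the square root of the average degree. The sketch does not implement the paper's trimming, robust PCA, or rounding steps, and it is not the authors' algorithm.

```python
import numpy as np

def sample_sbm(n, a, b, rng):
    """Sample a sparse 2-community SBM (illustrative helper, not from the paper).

    Pairs in the same community are joined with probability a/n,
    pairs in different communities with probability b/n.
    """
    labels = rng.integers(0, 2, size=n) * 2 - 1        # labels in {-1, +1}
    same = np.equal.outer(labels, labels)
    probs = np.where(same, a / n, b / n)
    upper = np.triu(rng.random((n, n)) < probs, k=1)   # sample each pair once
    adj = (upper | upper.T).astype(float)              # symmetric, zero diagonal
    return adj, labels

def truncate_high_degree(adj, max_degree):
    """Delete every edge incident to a vertex whose degree exceeds max_degree."""
    keep = adj.sum(axis=1) <= max_degree
    return adj * np.outer(keep, keep)

def bethe_hessian(adj, r):
    """Standard Bethe Hessian H(r) = (r^2 - 1) I - r A + D."""
    n = adj.shape[0]
    degrees = adj.sum(axis=1)
    return (r**2 - 1) * np.eye(n) - r * adj + np.diag(degrees)

rng = np.random.default_rng(0)

# Assumed parameters: (a - b)^2 = 64 > 2 (a + b) = 20, i.e. above the
# 2-community KS threshold.
adj, labels = sample_sbm(n=600, a=9.0, b=1.0, rng=rng)
adj = truncate_high_degree(adj, max_degree=20)

avg_degree = adj.sum() / adj.shape[0]
H = bethe_hessian(adj, r=np.sqrt(avg_degree))

# Eigenvalues come back in ascending order; negative ones are the candidates
# for carrying community information.
eigvals, eigvecs = np.linalg.eigh(H)
num_negative = int((eigvals < 0).sum())

# Naive sign rounding of the second eigenvector (2-community case only).
guess = np.sign(eigvecs[:, 1])
overlap = abs(np.mean(guess * labels))
print(f"negative eigenvalues: {num_negative}, overlap with truth: {overlap:.2f}")
```

On an uncorrupted graph, the sign pattern of an eigenvector with negative eigenvalue is a natural candidate for the community split. An adversary who edits a small number of edges can plant spurious negative eigenvalues, which is the situation the trimming and robust spectral steps are meant to handle.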