Computational and Statistical Thresholds in Multi-layer Stochastic Block Models

Jing Lei, Anru R. Zhang, Zihan Zhu
2023-11-14
Abstract: We study the problem of community recovery and detection in multi-layer stochastic block models, focusing on the critical network density threshold for consistent community structure inference. Using a prototypical two-block model, we reveal a computational barrier for such multi-layer stochastic block models that does not exist for its single-layer counterpart: When there are no computational constraints, the density threshold depends linearly on the number of layers. However, when restricted to polynomial-time algorithms, the density threshold scales with the square root of the number of layers, assuming correctness of a low-degree polynomial hardness conjecture. Our results provide a nearly complete picture of the optimal inference in multiple-layer stochastic block models and partially settle the open question in Lei and Lin (2022) regarding the optimality of the bias-adjusted spectral method.
Statistics Theory
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper investigates community recovery and detection in multi-layer stochastic block models (MLSBMs), focusing on the critical network density threshold for consistent inference of community structure. Specifically, it reveals a computational barrier in multi-layer stochastic block models that is absent in single-layer models:

1. **Without computational constraints**: The network density threshold depends linearly on the number of layers.
2. **With polynomial-time algorithms**: The network density threshold scales with the square root of the number of layers, assuming the low-degree polynomial hardness conjecture holds.

The main contribution of the paper is to characterize the network density required for consistent community recovery and detection in multi-layer stochastic block models, and to explain the gap between the information-theoretic threshold and the computational threshold. These results partially settle the open question in Lei and Lin (2022) regarding the optimality of the bias-adjusted spectral method.

### Key Findings

1. **Information-theoretic threshold**: The linear signal accumulation rate corresponds to the information-theoretic threshold, which is the optimal condition for community recovery and detection without computational constraints.
2. **Computational threshold**: The square-root signal accumulation rate corresponds to the computational threshold, which is the optimal condition for community recovery and detection by polynomial-time algorithms.

### Methods and Theoretical Framework

- **Model Definition**: The paper defines two multi-layer stochastic block models: the balanced two-community model \( P_{1,n} \) and the null model \( P_{0,n} \).
- **Asymptotic Behavior**: The asymptotic behavior is studied as the number of nodes \( n \) tends to infinity, with the number of layers \( T_n \) and the network density \( \rho_n \) allowed to vary with \( n \).
- **Computational Complexity**: A low-degree polynomial framework is used to analyze computational complexity, revealing that the computational difficulty mainly arises from the unknown layer identities.

### Main Results

- **Information-theoretic upper and lower bounds**: If \( nT_n\rho_n \) tends to infinity, the community structure can be consistently recovered and detected when computation is unconstrained.
- **Computational upper and lower bounds**: Assuming the low-degree polynomial hardness conjecture holds, if \( nT_n^{1/2}\rho_n \) is greater than a certain constant, the community structure can be consistently recovered and detected in polynomial time; otherwise, no polynomial-time algorithm can achieve this goal.

### Conclusion

By combining statistical analysis with low-degree computational lower bounds, the paper characterizes community recovery and detection in multi-layer stochastic block models. It shows that the information-theoretic threshold is governed by \( nT_n\rho_n \) while the computational threshold is governed by \( nT_n^{1/2}\rho_n \), so the density required by polynomial-time algorithms exceeds the information-theoretic requirement by a factor of roughly \( T_n^{1/2} \).
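
### Illustrative Simulation

The role of the unknown layer identities can be made concrete with a small simulation. The sketch below is not the authors' procedure. It assumes a hypothetical two-block model in which each layer carries a hidden sign `s_t`, so that within-community edges have probability `rho * (1 + s_t * delta)` and between-community edges have probability `rho * (1 - s_t * delta)`. The parameter `delta`, the alternating signs, and all function names are assumptions made for illustration. Summing the layers directly cancels the community signal, while a sum of squared adjacency matrices with the degree diagonal removed (a simplified stand-in for the bias-adjusted spectral idea of Lei and Lin (2022)) retains it.

```python
import numpy as np


def simulate_layers(n, T, rho, delta, rng):
    """Draw T adjacency matrices from an illustrative two-block model.

    Assumed model (not taken from the paper): layer t has a hidden sign s_t.
    Within-community edges appear with probability rho * (1 + s_t * delta),
    between-community edges with probability rho * (1 - s_t * delta).
    Requires even n and T, and rho * (1 + delta) <= 1.
    """
    theta = np.repeat([1, -1], n // 2)      # balanced labels in {+1, -1}
    signs = np.tile([1, -1], T // 2)        # half of the layers are flipped
    same = np.outer(theta, theta)           # +1 within, -1 between communities
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    layers = []
    for s in signs:
        prob = rho * (1 + s * delta * same)
        draws = (rng.random((n, n)) < prob) & upper
        A = draws.astype(float)
        layers.append(A + A.T)              # symmetric with zero diagonal
    return theta, layers


def second_eigvec_labels(M):
    """Label nodes by the sign of the eigenvector of the 2nd largest eigenvalue."""
    _, vecs = np.linalg.eigh(M)             # eigenvalues come in ascending order
    return np.where(vecs[:, -2] >= 0, 1, -1)


def accuracy(est, truth):
    """Fraction of correctly labeled nodes, up to a global label swap."""
    agree = np.mean(est == truth)
    return max(agree, 1 - agree)


n, T, rho, delta = 200, 30, 0.1, 0.8
rng = np.random.default_rng(0)
theta, layers = simulate_layers(n, T, rho, delta, rng)

# Naive aggregation: the hidden signs cancel the community signal.
naive = sum(layers)
# Squared aggregation with the degree diagonal removed: squaring removes
# the dependence on the sign of each layer.
adjusted = sum(A @ A - np.diag(A.sum(axis=1)) for A in layers)

acc_naive = accuracy(second_eigvec_labels(naive), theta)
acc_adjusted = accuracy(second_eigvec_labels(adjusted), theta)

print(f"n*T*rho = {n * T * rho:.0f}, n*sqrt(T)*rho = {n * np.sqrt(T) * rho:.1f}")
print(f"naive sum accuracy:       {acc_naive:.2f}")
print(f"squared-sum accuracy:     {acc_adjusted:.2f}")
```

The second eigenvector is used because, in this assumed model, the leading eigenvector of either aggregate is close to the constant vector and carries no community information. The two printed quantities `n*T*rho` and `n*sqrt(T)*rho` are the linear and square-root signal scales from the main results, evaluated at the simulated parameters.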