Bias-Corrected Joint Spectral Embedding for Multilayer Networks with Invariant Subspace: Entrywise Eigenvector Perturbation and Inference

Fangzheng Xie
2024-06-12
Abstract: In this paper, we propose to estimate the invariant subspace across heterogeneous multiple networks using a novel bias-corrected joint spectral embedding algorithm. The proposed algorithm recursively calibrates the diagonal bias of the sum of squared network adjacency matrices by leveraging the closed-form bias formula and iteratively updates the subspace estimator using the most recent estimated bias. Correspondingly, we establish a complete recipe for the entrywise subspace estimation theory for the proposed algorithm, including a sharp entrywise subspace perturbation bound and the entrywise eigenvector central limit theorem. Leveraging these results, we settle two multiple network inference problems: the exact community detection in multilayer stochastic block models and the hypothesis testing of the equality of membership profiles in multilayer mixed membership models. Our proof relies on delicate leave-one-out and leave-two-out analyses that are specifically tailored to block-wise symmetric random matrices and a martingale argument that is of fundamental interest for the entrywise eigenvector central limit theorem.
Statistics Theory, Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of estimating the invariant subspace shared across the layers of a heterogeneous multilayer network. Specifically, the author proposes a new Bias-Corrected Joint Spectral Embedding (BCJSE) algorithm for this task. The problem arises in application areas such as social network analysis and biological network analysis.

### Background and Problem Description of the Paper

1. **Importance of Network Data**:
   - Network data is a convenient form for representing the relational structure among multiple entities.
   - Network data is widely used in fields such as social science, biology, and computer science.
2. **Statistical Network Analysis**:
   - Random graph models are the basis of statistical network analysis: vertices are treated as deterministic, and the edges connecting them are treated as random variables.
   - Common random graph models include Stochastic Block Models (SBMs), Random Dot Product Graph models (RDPGs), and Latent Space Models.
3. **Heterogeneous Multilayer Networks**:
   - A heterogeneous multilayer network is a data set composed of multiple aligned networks, each of which is called a layer.
   - Such networks occur naturally in trade networks, fMRI studies, protein networks, traffic networks, gene co-expression networks, and social networks.
   - The heterogeneity across layers poses additional challenges and calls for methods beyond those used for single networks.

### Main Contributions of the Paper

1. **Proposing a New Bias-Corrected Joint Spectral Embedding Algorithm**:
   - The algorithm estimates the invariant subspace by recursively correcting the diagonal bias of the sum of squared network adjacency matrices and iteratively updating the subspace estimator.
   - Compared with existing de-biasing algorithms (such as heteroscedastic PCA), the algorithm is more computationally efficient and numerically more stable, and it can reach the optimal convergence rate with only \(O(1)\) iterations under mild conditions.
2. **Establishing the Entrywise Subspace Estimation Theory**:
   - The author establishes the entrywise subspace estimation theory for the BCJSE algorithm, including a sharp entrywise subspace perturbation bound and the entrywise eigenvector central limit theorem.
   - The proof relies on leave-one-out and leave-two-out analyses tailored to block-wise symmetric random matrices, together with a martingale argument.
3. **Solving Two Multilayer Network Inference Problems**:
   - Using these theoretical results, the author solves the exact community detection problem in the multilayer stochastic block model and the problem of testing the equality of membership profiles in the multilayer mixed membership model.
   - These are non-trivial extensions of the corresponding single-layer problems, because the average expected degree of each individual layer need not diverge, provided the signal-to-noise ratio of the aggregated network is sufficiently large.

### Technical Details of the Paper

- **Model Setup**:
  - Consider a set of \(m\) aligned network adjacency matrices \(\{A_t\}_{t = 1}^m\) with low-rank edge probability matrices \(\{P_t\}_{t = 1}^m\).
  - Assume that \(\text{rank}(P_t)=d\) for all \(t\in\{1,\ldots,m\}\) and that the matrices \(P_t\) share the same principal subspace.
- **Algorithm Description**:
  - The BCJSE algorithm estimates the invariant subspace \(U\) by recursively correcting the bias term \(M\).
  - The steps are initialization, an inner iteration that corrects the bias, and an outer iteration that updates the subspace estimator.
- **Theoretical Results**:
  - The author establishes the entrywise subspace perturbation bound and the entrywise eigenvector central limit theorem for the BCJSE algorithm.
  - These results show that, under mild conditions, the BCJSE algorithm reaches sharp estimation error bounds with only \(O(1)\) iterations.

### Conclusion

This paper proposes a new Bias-Corrected Joint Spectral Embedding algorithm for estimating the invariant subspace in heterogeneous multilayer networks. By establishing the entrywise subspace estimation theory, the author successfully solves...
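
As a rough illustration of the bias-correction idea summarized under the algorithm description, the Python sketch below debiases the diagonal of the sum of squared adjacency matrices and re-estimates the subspace a few times. It is not the paper's exact BCJSE procedure. It assumes binary, symmetric adjacency matrices without self-loops, for which the diagonal of each squared adjacency matrix equals the degree vector. It uses a single loop rather than separate inner and outer iterations, and it estimates the signal diagonal with a plug-in projection of each layer onto the current subspace. The function names `bias_corrected_embedding` and `simulate_multilayer_sbm`, the default `n_iter=3`, and the two-block simulation are hypothetical choices made for this example.

```python
import numpy as np


def top_eigvecs(G, d):
    """Eigenvectors of the symmetric matrix G for its d largest eigenvalues."""
    _, vecs = np.linalg.eigh(G)  # eigenvalues are returned in ascending order
    return vecs[:, -d:]


def bias_corrected_embedding(adjs, d, n_iter=3):
    """Simplified diagonal-debiased spectral embedding (illustrative sketch).

    adjs: array of shape (m, n, n); each layer is symmetric, 0/1, zero diagonal.
    Returns an n x d matrix with orthonormal columns.
    """
    adjs = np.asarray(adjs, dtype=float)
    S = np.einsum("tij,tjk->ik", adjs, adjs)  # sum_t A_t^2
    # For binary hollow A_t, diag(A_t^2) is the degree vector of layer t.
    degree_sum = adjs.sum(axis=(0, 2))
    U = top_eigvecs(S, d)  # uncorrected starting point
    for _ in range(n_iter):
        proj = U @ U.T
        P_hat = proj @ adjs @ proj  # plug-in estimate of each P_t, shape (m, n, n)
        # Estimated diagonal of sum_t P_t^2 (row-wise squared norms, summed over t).
        signal_diag = np.einsum("tij,tij->i", P_hat, P_hat)
        bias = degree_sum - signal_diag  # estimated diagonal bias
        U = top_eigvecs(S - np.diag(bias), d)
    return U


def simulate_multilayer_sbm(n, block_mats, rng):
    """One symmetric hollow 0/1 layer per 2x2 block matrix; n must be even."""
    labels = np.repeat([0, 1], n // 2)
    layers = []
    for B in block_mats:
        P = B[labels][:, labels]  # n x n edge probability matrix
        upper = np.triu(rng.random((n, n)) < P, k=1)
        layers.append((upper + upper.T).astype(float))
    return np.stack(layers), labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    B_assort = np.array([[0.6, 0.1], [0.1, 0.6]])
    B_disassort = np.array([[0.1, 0.5], [0.5, 0.1]])
    adjs, labels = simulate_multilayer_sbm(100, [B_assort, B_disassort] * 5, rng)
    U_hat = bias_corrected_embedding(adjs, d=2)
    Z = np.eye(2)[labels] / np.sqrt(50)  # orthonormal basis of the true subspace
    print("projection error:", np.linalg.norm(U_hat @ U_hat.T - Z @ Z.T))
```

The simulation mixes assortative and disassortative layers, so some eigenvalues of the edge probability matrices have opposite signs across layers. Summing the squared adjacency matrices avoids the cancellation that summing the raw matrices could cause. The price is the diagonal bias that the loop removes.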