Dynamic clustering for heterophilic stochastic block models with time-varying node memberships

Kevin Z Lin, Jing Lei
2024-03-09
Abstract: We consider a time-ordered sequence of networks stemming from stochastic block models where nodes gradually change memberships over time and no network at any single time point contains sufficient signal strength to recover its community structure. To estimate the time-varying community structure, we develop KD-SoS (kernel debiased sum-of-square), a method performing spectral clustering after a debiased sum-of-squared aggregation of adjacency matrices. Our theory demonstrates via a novel bias-variance decomposition that KD-SoS achieves consistent community detection of each network even when heterophilic networks do not require smoothness in the time-varying dynamics of between-community connectivities. We also prove the identifiability of aligning community structures across time based on how rapidly nodes change communities, and develop a data-adaptive bandwidth tuning procedure for KD-SoS. We demonstrate the utility and advantages of KD-SoS through simulations and a novel analysis of the time-varying dynamics in gene coordination in the human developing brain system.
Statistics Theory, Applications
What problem does this paper attempt to address?
The paper addresses community detection in a time-ordered sequence of networks where node memberships change over time and the signal in any single network is too weak to recover its community structure. Specifically, it considers heterophilic Stochastic Block Models (SBMs) in which nodes gradually change their community affiliation. Because each time point carries so little signal, community detection methods applied to one network at a time cannot reliably identify the changing community structure. To tackle this challenge, the authors develop KD-SoS (Kernel Debiased Sum-of-Square), which aggregates debiased squared adjacency matrices from nearby time points and then applies spectral clustering to the aggregate. The main contributions of the paper include:

1. **Theoretical Guarantee**: Through a novel bias-variance decomposition, the paper proves that KD-SoS consistently estimates the community structure of each network even as node memberships change over time.
2. **Computational Efficiency**: The proposed method is not only theoretically sound but also computationally efficient, making it suitable for large-scale network data.
3. **Strong Adaptability**: The method imposes almost no restrictions on how connection patterns change, requiring only a positivity condition on the locally averaged squared connectivity matrix.
4. **Bandwidth Selection**: A data-adaptive bandwidth tuning procedure selects the smoothing parameter without assuming that between-community connectivities change smoothly over time.

The paper demonstrates the effectiveness and advantages of KD-SoS through simulation experiments and a real data analysis of the time-varying dynamics of gene coordination during human brain development.
Recovering this kind of time-varying community structure matters for understanding dynamic complex networks, particularly in biomedical applications such as the study of gene expression networks.
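
The pipeline summarized above (aggregate debiased squared adjacency matrices over nearby time points, then run spectral clustering) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a box kernel over a window of `bandwidth` time steps, symmetric 0/1 adjacency matrices without self-loops (so that subtracting the degree diagonal removes the diagonal bias of the squared matrix), and a small hand-rolled k-means step. The function names `debiased_sos`, `simple_kmeans`, and `kd_sos_cluster` are hypothetical.

```python
import numpy as np


def debiased_sos(adjs, t, bandwidth):
    """Box-kernel average of (A_s @ A_s - D_s) over time points with |s - t| <= bandwidth.

    Assumes each A_s is a symmetric 0/1 matrix with zero diagonal, so the diagonal
    of A_s @ A_s equals the degree vector and D_s = diag(degrees) removes it.
    """
    lo = max(0, t - bandwidth)
    hi = min(len(adjs), t + bandwidth + 1)
    n = adjs[0].shape[0]
    total = np.zeros((n, n))
    for s in range(lo, hi):
        A = adjs[s].astype(float)
        total += A @ A - np.diag(A.sum(axis=1))
    return total / (hi - lo)


def simple_kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; a stand-in for any k-means routine."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = labels == j
            if members.any():  # keep the old center if a cluster empties out
                centers[j] = X[members].mean(axis=0)
    return labels


def kd_sos_cluster(adjs, t, n_communities, bandwidth):
    """Estimate community labels at time index t from the aggregated matrix."""
    M = debiased_sos(adjs, t, bandwidth)
    vals, vecs = np.linalg.eigh(M)
    # Keep the eigenvectors whose eigenvalues are largest in magnitude.
    top = np.argsort(np.abs(vals))[-n_communities:]
    return simple_kmeans(vecs[:, top], n_communities)
```

Calling `kd_sos_cluster(adjs, t, K, bandwidth)` for each time index `t` gives one label vector per network. Squaring before aggregating is what keeps heterophilic and homophilic connectivity from cancelling each other out in the average, since squared matrices contribute with the same sign. The labels from different time points are not aligned with each other in this sketch, and the bandwidth is supplied by hand rather than tuned adaptively.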