Covariate-Dependent Clustering of Undirected Networks with Brain-Imaging Data

Sharmistha Guha, Rajarshi Guhaniyogi
DOI: https://doi.org/10.1080/00401706.2024.2321930
2024-03-27
Technometrics
Abstract: This article focuses on model-based clustering of subjects based on the shared relationships of subject-specific networks and covariates in scenarios when there are differences in the relationship between networks and covariates for different groups of subjects. It is also of interest to identify the network nodes significantly associated with each covariate in each cluster of subjects. To address these methodological questions, we propose a novel nonparametric Bayesian mixture modeling framework with an undirected network response and scalar predictors. The symmetric matrix coefficients corresponding to the scalar predictors of interest in each mixture component involve low-rankness and group sparsity within the low-rank structure. While the low-rank structure in the network coefficients adds parsimony and computational efficiency, the group sparsity within the low-rank structure enables drawing inference on network nodes and cells significantly associated with each scalar predictor. Being a principled Bayesian mixture modeling framework, our approach allows model-based identification of the number of clusters, offers clustering uncertainty in terms of the co-clustering matrix and presents precise characterization of uncertainty in identifying network nodes significantly related to a predictor in each cluster. Empirical results in various simulation scenarios illustrate substantial inferential gains of the proposed framework in comparison with competitors. Analysis of a real brain connectome dataset using the proposed method provides interesting insights into the brain regions of interest (ROIs) significantly related to creative achievement in each cluster of subjects.
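To make the "low-rankness and group sparsity" of a symmetric network coefficient concrete, the following is a minimal illustrative sketch, not the authors' exact parameterization. It assumes a coefficient of the form B = sum_r lam[r] * u_r u_r^T, where each node has a latent row in U and a node is treated as unrelated to the predictor when its entire row is zero. The names `low_rank_symmetric_coefficient`, `U`, `lam`, and `active` are hypothetical and chosen for illustration only.

```python
import numpy as np


def low_rank_symmetric_coefficient(U, lam):
    """Build B = sum_r lam[r] * u_r u_r^T from a V x R matrix U.

    Assumed form for illustration; the result is symmetric with rank <= R.
    """
    U = np.asarray(U, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # (U * lam) scales column r of U by lam[r]; the product with U.T sums the
    # rank-one terms.
    return (U * lam) @ U.T


rng = np.random.default_rng(0)
V, R = 6, 2  # V network nodes, rank-R coefficient (toy sizes)

U = rng.normal(size=(V, R))

# Group sparsity (assumed mechanism): zero out the whole latent row of every
# node that is unrelated to the predictor.
active = np.array([True, False, True, True, False, False])
U[~active] = 0.0

lam = np.array([1.0, -0.5])
B = low_rank_symmetric_coefficient(U, lam)
```

Under this assumed form, zeroing a node's row in `U` zeroes the matching row and column of `B`, so every edge touching that node has no association with the predictor. This is one way node-level inference can follow from sparsity inside a low-rank structure.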
The supplementary material provides the convergence rate for the posterior predictive density of the proposed model, additional simulation examples with model misspecification, the full conditional distributions needed to run the Markov chain Monte Carlo (MCMC) algorithm, and traceplots for various model parameters that demonstrate convergence of the MCMC algorithm.
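The co-clustering matrix mentioned in the abstract can be estimated from MCMC output in a standard way: entry (i, j) is the fraction of posterior draws in which subjects i and j share a cluster label. The sketch below assumes the sampler's label draws are stored as an integer array of shape (number of draws, number of subjects); the function name `co_clustering_matrix` and the toy `label_draws` array are hypothetical and are not taken from the paper or its supplement.

```python
import numpy as np


def co_clustering_matrix(label_draws):
    """Estimate pairwise co-clustering probabilities from MCMC label draws.

    label_draws: array of shape (n_draws, n_subjects) with cluster labels.
    Returns an n_subjects x n_subjects matrix whose (i, j) entry is the
    proportion of draws in which subjects i and j have the same label.
    """
    draws = np.asarray(label_draws)
    # Broadcasting compares every pair of subjects within each draw.
    same_cluster = draws[:, :, None] == draws[:, None, :]
    return same_cluster.mean(axis=0)


# Toy draws for 4 subjects (made up for illustration). Draws 1 and 2 describe
# the same partition with swapped labels.
label_draws = np.array([
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
])
P = co_clustering_matrix(label_draws)
```

Because the matrix depends only on whether two labels agree within a draw, it is unaffected by label switching across draws, which makes it a convenient summary of clustering uncertainty.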
statistics & probability