Abstract: This article proposes a novel Bayesian classification framework for networks with labeled nodes. While literature on statistical modeling of network data typically involves analysis of a single network, the recent emergence of complex data in several biological applications, including brain imaging studies, presents a need to devise a network classifier for subjects. This article considers an application from a brain connectome study, where the overarching goal is to classify subjects into two separate groups based on their brain network data, along with identifying influential regions of interest (ROIs) (referred to as nodes). Existing approaches either treat all edge weights as a long vector or summarize the network information with a few summary measures. Both these approaches ignore the full network structure, may lead to less desirable inference in small samples, and are not designed to identify significant network nodes. We propose a novel binary logistic regression framework with the network as the predictor and a binary response, the network predictor coefficient being modeled using a novel class of global-local shrinkage priors. The framework is able to accurately detect nodes and edges in the network influencing the classification. Our framework is implemented using an efficient Markov Chain Monte Carlo algorithm. Theoretically, we show asymptotically optimal classification for the proposed framework when the number of network edges grows faster than the sample size. The framework is empirically validated by extensive simulation studies and analysis of brain connectome data.

What problem does this paper attempt to address?

The paper addresses classification with high-dimensional network data while simultaneously identifying the nodes and edges that influence the classification. Specifically, given network data with labeled nodes (such as brain connectome data), it asks how to classify subjects into two groups (for example, high or low IQ) and how to identify which brain regions (nodes) and connections (edges) drive that classification. The paper proposes a new Bayesian classification framework built on network global-local shrinkage priors, which aims to overcome several limitations of existing methods:

1. **Limitations of existing methods**:
   - Existing methods usually either treat all edge weights as one long vector or summarize the network with a few summary measures. Both approaches ignore the full network structure, may lead to poor inference in small samples, and are not designed to identify important network nodes.
   - Because they do not fully use the structural features of network data, these methods can perform poorly on complex data such as that arising in brain imaging studies.
2. **Solutions proposed in the paper**:
   - The paper proposes a binary logistic regression framework in which the network is the predictor and the binary response indicates which of the two groups a subject belongs to.
   - The network predictor coefficient is modeled with a new class of network global-local shrinkage priors. This prior can introduce low-rank and near-sparse structure into the model, thereby better capturing the interaction effects and residual effects in the network.
   - Posterior computation is carried out with an efficient Markov chain Monte Carlo (MCMC) algorithm.
3. **Theoretical contributions**:
   - The paper shows that the proposed framework achieves asymptotically optimal classification under specific conditions, in particular when the number of network edges grows faster than the sample size (a super-linear rate).
   - It provides insight into how the number of network nodes and the sparsity of the true network predictor coefficients may change with the sample size while asymptotically optimal classification is still achieved.

In summary, by proposing a new Bayesian classification framework, the paper aims both to improve classification accuracy and to identify the network nodes and edges that have a significant impact on the classification, which matters for understanding complex network data such as brain connectomes.
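The model structure summarized above, a logistic regression whose predictor is a whole network, can be illustrated with a minimal simulation sketch. This is not the paper's implementation: the function names (`network_logit`, `make_coefficients`, `random_network`), the low-rank form `U @ U.T` of the coefficient matrix, and all numeric settings are assumptions made for illustration, and the shrinkage prior and MCMC sampler are not shown. The sketch only shows how a symmetric coefficient matrix tied to a few influential nodes yields class probabilities for each subject's network.

```python
import numpy as np

def network_logit(A, B, mu=0.0):
    """Linear predictor mu + sum over k<l of A[k, l] * B[k, l] (undirected network)."""
    iu = np.triu_indices(A.shape[0], k=1)  # upper triangle, diagonal excluded
    return mu + np.sum(A[iu] * B[iu])

def make_coefficients(n_nodes, rank, influential, rng):
    """Hypothetical low-rank coefficient matrix (an assumption, not the paper's prior).

    Only the nodes listed in `influential` get nonzero latent vectors, so every
    edge touching a non-influential node has coefficient exactly zero.
    """
    U = np.zeros((n_nodes, rank))
    U[influential] = rng.normal(size=(len(influential), rank))
    B = U @ U.T
    np.fill_diagonal(B, 0.0)  # no self-loops
    return B

def random_network(n_nodes, rng):
    """Simulated symmetric weighted network with a zero diagonal."""
    upper = np.triu(rng.normal(size=(n_nodes, n_nodes)), k=1)
    return upper + upper.T

rng = np.random.default_rng(0)
n_nodes, n_subjects = 10, 50
B = make_coefficients(n_nodes, rank=2, influential=[0, 3, 7], rng=rng)

networks = [random_network(n_nodes, rng) for _ in range(n_subjects)]
logits = np.array([network_logit(A, B) for A in networks])
probs = 0.5 * (1.0 + np.tanh(logits / 2.0))  # numerically stable logistic function
labels = rng.binomial(1, probs)              # simulated binary group membership
```

Under this assumed parameterization, node-level selection corresponds to asking which rows of `U` are nonzero, which is one way to see why treating the edge weights as an unstructured long vector loses the node information.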

High Dimensional Bayesian Network Classification with Network Global-Local Shrinkage Priors

Bayesian Regression with Undirected Network Predictors with an Application to Brain Connectome Data

Covariate-Dependent Clustering of Undirected Networks with Brain-Imaging Data

Accounting for network noise in graph-guided Bayesian modeling of structured high-dimensional data

Brain Network Classification Based on Dynamic Graph Attention Information Bottleneck

Bayesian thresholded modeling for integrating brain node and network predictors

Spatial-Temporal Dynamic Hypergraph Information Bottleneck for Brain Network Classification

Layer adaptive node selection in Bayesian neural networks: Statistical guarantees and implementation details

High Dimensional Logistic Regression Under Network Dependence

Scalable Bayesian variable selection for structured high‐dimensional data

Bayesian Approach to Linear Bayesian Networks

Bayesian Approaches for Large Biological Networks

High-Dimensional Bayesian Network Inference From Systems Genetics Data Using Genetic Node Ordering

A modeling framework for detecting and leveraging node-level information in Bayesian network inference

Bayes optimal learning in high-dimensional linear regression with network side information

Multilevel Bayesian Deep Neural Networks

A Full Bayesian Approach to Sparse Network Inference Using Heterogeneous Datasets

Bayesian Inference of Networks Across Multiple Sample Groups and Data Types

A tree-like Bayesian structure learning algorithm for small-sample datasets from complex biological model systems

Variational Bayes Neural Network: Posterior Consistency, Classification Accuracy and Computational Challenges