Accounting for network noise in graph-guided Bayesian modeling of structured high-dimensional data

Wenrui Li, Changgee Chang, Suprateek Kundu, Qi Long

DOI: https://doi.org/10.1093/biomtc/ujae012

IF: 1.701

2024-01-29

Biometrics

Abstract: There is a growing body of literature on knowledge-guided statistical learning methods for analysis of structured high-dimensional data (such as genomic and transcriptomic data) that can incorporate knowledge of underlying networks derived from functional genomics and functional proteomics. These methods have been shown to improve variable selection and prediction accuracy and yield more interpretable results. However, these methods typically use graphs extracted from existing databases or rely on subject matter expertise, which are known to be incomplete and may contain false edges. To address this gap, we propose a graph-guided Bayesian modeling framework to account for network noise in regression models involving structured high-dimensional predictors. Specifically, we use 2 sources of network information, including the noisy graph extracted from existing databases and the estimated graph from observed predictors in the dataset at hand, to inform the model for the true underlying network via a latent scale modeling framework. This model is coupled with the Bayesian regression model with structured high-dimensional predictors involving an adaptive structured shrinkage prior. We develop an efficient Markov chain Monte Carlo algorithm for posterior sampling. We demonstrate the advantages of our method over existing methods in simulations, and through analyses of a genomics dataset and another proteomics dataset for Alzheimer’s disease.

statistics & probability, mathematical & computational biology, biology

What problem does this paper attempt to address?

The paper addresses network noise in regression models with structured high-dimensional predictors, such as genomic and transcriptomic data. Existing knowledge-guided statistical learning methods typically rely on graphs extracted from existing databases or on subject matter expertise, and such graphs are known to be incomplete and may contain false edges. To account for this, the authors propose a graph-guided Bayesian modeling framework that informs a model for the true underlying network from two sources of network information: the noisy graph extracted from existing databases and the graph estimated from the predictors observed in the dataset at hand. The two sources are combined via a latent scale modeling framework, which is coupled with a Bayesian regression model that uses an adaptive structured shrinkage prior. The authors also develop an efficient Markov chain Monte Carlo algorithm for posterior sampling. Simulation studies and analyses of a genomics dataset and a proteomics dataset for Alzheimer's disease are used to demonstrate the advantages of the method over existing methods.
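
The idea of combining two noisy network sources can be illustrated with a deliberately simplified toy. The sketch below is not the paper's latent scale model. It assumes that each source reports every possible edge independently with a fixed false-positive and false-negative rate, and it applies Bayes' rule edge by edge. The function names, the node labels (g1, g2, g3), and all rate values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def edge_posterior(in_db, in_est, prior=0.1,
                   db_fp=0.05, db_fn=0.3, est_fp=0.1, est_fn=0.4):
    """Posterior probability that an edge truly exists, given two binary reports.

    Assumption (not from the paper): the two sources are conditionally
    independent given the true edge status, with known error rates.
    """
    def lik(observed, fp, fn, edge):
        # Probability that a source reports the edge, given the truth.
        p_report = (1 - fn) if edge else fp
        return p_report if observed else 1 - p_report

    num = prior * lik(in_db, db_fp, db_fn, True) * lik(in_est, est_fp, est_fn, True)
    alt = (1 - prior) * lik(in_db, db_fp, db_fn, False) * lik(in_est, est_fp, est_fn, False)
    return num / (num + alt)


def combine_graphs(nodes, db_edges, est_edges, **rates):
    """Return a posterior edge probability for every unordered node pair."""
    db = {frozenset(e) for e in db_edges}
    est = {frozenset(e) for e in est_edges}
    return {
        (u, v): edge_posterior(frozenset((u, v)) in db,
                               frozenset((u, v)) in est, **rates)
        for u, v in combinations(nodes, 2)
    }


# Hypothetical example: the database graph lists two edges, while the graph
# estimated from the data supports only one of them.
nodes = ["g1", "g2", "g3"]
post = combine_graphs(nodes,
                      db_edges=[("g1", "g2"), ("g2", "g3")],
                      est_edges=[("g1", "g2")])
for pair, p in post.items():
    print(pair, round(p, 3))
```

In this toy, an edge supported by both sources receives a higher posterior probability than an edge found only in the database, which in turn ranks above an edge reported by neither source. The paper's framework goes further by coupling the network model with the regression model, which this sketch does not attempt.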

Scalable Bayesian variable selection for structured high‐dimensional data

Information-Theoretic Scoring Rules to Learn Additive Bayesian Network Applied to Epidemiology

High Dimensional Bayesian Network Classification with Network Global-Local Shrinkage Priors

Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler

Bayesian graphical models for computational network biology

Joint network and node selection for pathway-based genomic data analysis

High Dimensional Logistic Regression Under Network Dependence

Bayesian Blind Source Separation for Data with Network Structure

Bayesian Inference of Networks Across Multiple Sample Groups and Data Types

Bayes optimal learning in high-dimensional linear regression with network side information

A modeling framework for detecting and leveraging node-level information in Bayesian network inference

Bayesian Structure Learning in Multi-layered Genomic Networks

Incorporating graph information in Bayesian factor analysis with robust and adaptive shrinkage priors

A Full Bayesian Approach to Sparse Network Inference Using Heterogeneous Datasets

Impact of noise on molecular network inference

Bayesian learning of multiple directed networks from observational data

Bayesian Chain Graph LASSO Models to Learn Sparse Microbial Networks with Predictors

Scalable Bayesian regression in high dimensions with multiple data sources

Graph Structure Learning with Interpretable Bayesian Neural Networks