Incorporating graph information in Bayesian factor analysis with robust and adaptive shrinkage priors

Qiyiwen Zhang, Changgee Chang, Li Shen, Qi Long

DOI: https://doi.org/10.1093/biomtc/ujad014

IF: 1.701

2024-01-29

Biometrics

Abstract: There has been increasing interest in decomposing high-dimensional multi-omics data into a product of low-rank and sparse matrices for the purpose of dimension reduction and feature engineering. Bayesian factor models achieve such low-dimensional representations of the original data through different sparsity-inducing priors. However, few of these models can efficiently incorporate the information encoded by biological graphs, which has already been proven useful in many analysis tasks. In this work, we propose a Bayesian factor model with novel hierarchical priors, which incorporate biological graph knowledge as a tool for identifying groups of genes that function collaboratively. The proposed model therefore enables sparsity within networks by allowing each factor loading to be shrunk adaptively and by adding layers that relate individual shrinkage parameters to the underlying graph information, both of which yield more accurate recovery of the structure of the factor loadings. Further, these new priors overcome the phase transition phenomenon, in contrast to existing graph-incorporated approaches, so that the model is robust to noisy edges that are inconsistent with the actual sparsity structure of the factor loadings. Finally, our model can handle both continuous and discrete data types. The proposed method is shown to outperform several existing factor analysis methods in simulation experiments and real data analyses.

statistics & probability, mathematical & computational biology, biology

What problem does this paper attempt to address?

This paper addresses how to integrate biological graph information into factor analysis of high-dimensional multi-omics data, in order to improve dimension reduction and feature engineering. Existing Bayesian factor models obtain low-dimensional representations through various sparsity-inducing priors, but few of them can efficiently use known biological network information. This network information has proven useful in many analysis tasks, especially for identifying groups of genes that work together. The paper therefore proposes a new Bayesian factor model with novel hierarchical priors. The priors use biological graph knowledge to identify groups of genes that function collaboratively, and they shrink each factor loading adaptively. Additional layers in the hierarchy relate the individual shrinkage parameters to the underlying graph, which yields more accurate recovery of the factor loading structure. The priors also overcome the phase transition phenomenon that affects existing graph-incorporated approaches, so the model is robust to noisy edges that are inconsistent with the true sparsity structure of the loadings. Finally, the model handles both continuous and discrete data types.
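To make the setting concrete, the following is a minimal Python sketch of the kind of data such a model targets: a low-rank factor decomposition whose sparse loadings follow modules of a feature graph. It is an illustrative simulation only, not the authors' prior or inference procedure. The function names, the two-module chain graph, and all numeric settings are hypothetical choices made for this example.

```python
import numpy as np


def simulate_graph_sparse_factors(n=200, p=12, noise_sd=0.1, seed=0):
    """Simulate X = Z @ W.T + E with two factors and graph-aligned sparse loadings.

    Assumption for illustration: features 0..p/2-1 form one graph module and
    the remaining features form a second one; each factor loads on one module.
    """
    rng = np.random.default_rng(seed)
    half = p // 2
    # Chain edges inside each module, with no edge across the module boundary.
    edges = [(j, j + 1) for j in range(p - 1) if j + 1 != half]
    # Sparse loading matrix (p features x 2 factors).
    W = np.zeros((p, 2))
    W[:half, 0] = rng.normal(1.0, 0.1, size=half)
    W[half:, 1] = rng.normal(1.0, 0.1, size=p - half)
    Z = rng.normal(size=(n, 2))                  # latent factor scores
    E = rng.normal(scale=noise_sd, size=(n, p))  # idiosyncratic noise
    X = Z @ W.T + E
    return X, W, Z, edges


def edge_support_agreement(W, edges):
    """Fraction of edges whose two endpoints share the same nonzero loading pattern."""
    support = W != 0
    return float(np.mean([np.array_equal(support[i], support[j]) for i, j in edges]))


X, W, Z, edges = simulate_graph_sparse_factors()
clean_agreement = edge_support_agreement(W, edges)
# A "noisy" edge joins the two modules and so contradicts the sparsity structure.
noisy_agreement = edge_support_agreement(W, edges + [(5, 6)])
```

In this toy example every true edge connects features with identical loading support, while the added cross-module edge does not. That is the kind of inconsistent edge the abstract says the proposed priors are designed to tolerate.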

Sparse Bayesian factor analysis when the number of factors is unknown

Incorporating Biological Knowledge with Factor Graph Neural Network for Interpretable Deep Learning

Scalable Bayesian variable selection for structured high‐dimensional data

Accounting for network noise in graph-guided Bayesian modeling of structured high-dimensional data

A sparse factor model for clustering high‐dimensional longitudinal data

Structured prior distributions for the covariance matrix in latent factor models

Bayesian graph selection consistency under model misspecification

Enhancing Scalability in Bayesian Nonparametric Factor Analysis of Spatiotemporal Data

Integrative analysis of multi-omics and imaging data with incorporation of biological information via structural Bayesian factor analysis

Exponential Family Factors for Bayesian Factor Analysis

Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity

Bayesian Multi-study Factor Analysis for High-throughput Biological Data

Scalable Probabilistic Matrix Factorization with Graph-Based Priors

High-dimensional Factor Analysis for Network-linked Data

A Unified Bayesian Framework for Bi-overlapping-Clustering Multi-omics Data via Sparse Matrix Factorization

A Comparison of Bayesian Inference Techniques for Sparse Factor Analysis

Factor modelling for high-dimensional functional time series

The Infinite Hierarchical Factor Regression Model

A supervised Bayesian factor model for the identification of multi-omics signatures

Bayesian Chain Graph LASSO Models to Learn Sparse Microbial Networks with Predictors