Representing core gene expression activity relationships using the latent structure implicit in Bayesian networks

Jiahao Gao, Mark Gerstein
DOI: https://doi.org/10.1093/bioinformatics/btae463
IF: 5.8
2024-07-25
Bioinformatics
Abstract: Motivation: Many types of networks, such as co-expression or ChIP-seq-based gene-regulatory networks, provide useful information for biomedical studies. However, they are often too densely connected to interpret, forming “indecipherable hairballs”. Results: To address this issue, we propose that a Bayesian network can summarize the core relationships between gene expression activities. This network, which we call the LatentDAG, is substantially simpler (by two orders of magnitude) than conventional co-expression and ChIP-seq networks. It provides clearer clusters, without extraneous cross-cluster connections, and clear separators between modules. Moreover, a number of clear examples show how it bridges steps in the transcriptional regulatory network and other networks (e.g., RNA-binding protein networks). In conjunction with a graph neural network (GNN), the LatentDAG works better than other biological networks in a variety of tasks, including predicting gene conservation and clustering genes. Availability: Code is available at https://github.com/gersteinlab/LatentDAG. Supplementary information: Supplementary data are available at Bioinformatics online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?