Using high-throughput multi-omics data to investigate structural balance in elementary gene regulatory network motifs

Alberto Zenere, Olof Rundquist, Mika Gustafsson, Claudio Altafini
DOI: https://doi.org/10.1093/bioinformatics/btab577
IF: 5.8
2021-08-12
Bioinformatics
Abstract:
Motivation: The simultaneous availability of ATAC-seq and RNA-seq experiments makes it possible to gain more in-depth knowledge of the regulatory mechanisms occurring in gene regulatory networks. In this article, we highlight and analyze two novel aspects that leverage the possibility of pairing RNA-seq and ATAC-seq data. Namely, we investigate the causality of the relationships between transcription factors, chromatin and target genes, and the internal consistency between the two omics, here measured in terms of structural balance in the sample correlations along elementary length-3 cycles.
Results: We propose a framework that uses the a priori knowledge on the data to infer elementary causal regulatory motifs (namely chains and forks) in the network. It is based on the notions of conditional independence and partial correlation, and can be applied to both longitudinal and non-longitudinal data. Our analysis highlights a strong connection between the causal regulatory motifs that are selected by the data and the structural balance of the underlying sample correlation graphs: strikingly, >97% of the selected regulatory motifs belong to a balanced subgraph. This result shows that internal consistency, as measured by structural balance, is close to a necessary condition for 3-node regulatory motifs to satisfy causality rules.
Availability and implementation: The analysis was carried out in MATLAB and the code can be found at https://github.com/albertozenere/Multi-omics-elementary-regulatory-motifs.
Supplementary information: Supplementary data are available at Bioinformatics online.
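To make the two notions used in the abstract concrete, the following is a minimal Python sketch, not the authors' MATLAB implementation. It assumes a toy chain TF -> chromatin peak -> target gene with made-up coefficients, and it uses the textbook first-order partial correlation and the usual sign-product definition of balance for a length-3 cycle. The names `simulate_chain`, `pearson`, `partial_corr` and `is_balanced` are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def is_balanced(r_ab, r_bc, r_ac):
    """A length-3 cycle is balanced when the product of its edge signs is positive."""
    return r_ab * r_bc * r_ac > 0


def simulate_chain(n, seed=0):
    """Toy chain TF -> peak -> gene (assumed coefficients, illustration only)."""
    rng = random.Random(seed)
    tf = [rng.gauss(0, 1) for _ in range(n)]
    peak = [0.9 * t + 0.5 * rng.gauss(0, 1) for t in tf]
    gene = [0.9 * p + 0.5 * rng.gauss(0, 1) for p in peak]
    return tf, peak, gene


if __name__ == "__main__":
    tf, peak, gene = simulate_chain(2000)
    # In a chain, TF and gene are marginally correlated but become
    # (approximately) uncorrelated once the intermediate peak is conditioned on.
    print("corr(TF, gene)        =", round(pearson(tf, gene), 3))
    print("pcorr(TF, gene | peak) =", round(partial_corr(tf, gene, peak), 3))
    print("triangle balanced:",
          is_balanced(pearson(tf, peak), pearson(peak, gene), pearson(tf, gene)))
```

In this toy chain the marginal TF-gene correlation is large while the partial correlation given the peak is close to zero, which is the conditional-independence signature of a chain; a fork would show the analogous pattern when conditioning on the common parent. How the paper combines these tests with a priori knowledge, and which thresholds it uses, is not specified in the abstract and is not reproduced here.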
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology