Integrative Inference of Transcriptional Networks in Arabidopsis
I. D. Clercq, J. D. Velde, Xiaopeng Luo, Li Liu, V. Storme, Robin Pottie, Dries Vaneechoutte, F. Breusegem, K. Vandepoele
DOI: https://doi.org/10.1101/2020.08.11.245902
2020-01-01
bioRxiv
Abstract: Gene regulation is a dynamic process in which transcription factors (TFs) play an important role to control spatiotemporal gene expression. While gene regulatory networks describe the interactions between TFs and their target genes, our global knowledge about the complexity of TF control for different genes and biological processes is incomplete. To enhance our global understanding of regulatory interactions in Arabidopsis thaliana, different regulatory input networks capturing complementary information about DNA motifs, open chromatin, TF binding and expression-based regulatory interactions, were combined using a supervised learning approach, resulting in an integrated gene regulatory network (iGRN) covering 1,491 TFs and 31,393 target genes (1.7 million interactions). This iGRN outperforms the different input networks to predict known regulatory interactions and has a similar performance to recover functional interactions compared to state-of-the-art experimental methods like yeast one-hybrid and ChIP-seq. The iGRN correctly inferred known functions for 681 TFs and predicted new gene functions for hundreds of unknown TFs. For regulators predicted to be involved in reactive oxygen species stress regulation, we confirmed in total 75% of TFs with a function in ROS and/or physiological stress responses. This includes 13 novel ROS regulators, previously not connected to any ROS or stress function, that were experimentally validated in our ROS-specific phenotypic assays of loss- or gain-of-function lines. In conclusion, the presented iGRN offers a high-quality starting point to enhance our understanding of gene regulation in plants by integrating different experimental data types at the network level.
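As an illustration of what combining input networks with supervised learning can look like, the following is a minimal sketch and not the authors' implementation. The abstract does not name the learning algorithm, so a plain logistic regression is used here as a stand-in. All edges, scores, gene names and the toy gold standard are hypothetical. Each candidate TF-target edge is described by its score in each input network, and a classifier trained on known interactions assigns an integrated score.

```python
import math

# Hypothetical input networks: each maps a (TF, target) edge to a score in [0, 1].
# The four evidence types mirror those named in the abstract; the data are invented.
INPUT_NETWORKS = {
    "motif": {("TF1", "G1"): 1.0, ("TF1", "G2"): 1.0, ("TF2", "G3"): 1.0},
    "open_chromatin": {("TF1", "G1"): 1.0, ("TF2", "G3"): 1.0, ("TF2", "G4"): 1.0},
    "tf_binding": {("TF1", "G1"): 1.0, ("TF2", "G3"): 1.0},
    "expression": {("TF1", "G1"): 0.9, ("TF1", "G2"): 0.2,
                   ("TF2", "G3"): 0.8, ("TF2", "G4"): 0.3},
}
NETWORK_NAMES = sorted(INPUT_NETWORKS)

# Hypothetical gold standard: 1 = known regulatory interaction, 0 = negative example.
TRAINING_EDGES = {
    ("TF1", "G1"): 1,
    ("TF2", "G3"): 1,
    ("TF1", "G2"): 0,
    ("TF2", "G4"): 0,
}


def edge_features(edge):
    """One feature per input network; an edge absent from a network scores 0."""
    return [INPUT_NETWORKS[name].get(edge, 0.0) for name in NETWORK_NAMES]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, epochs=500, lr=0.5):
    """Fit logistic regression by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = y - p
            weights = [w + lr * err * v for w, v in zip(weights, x)]
            bias += lr * err
    return weights, bias


def score_edge(edge, weights, bias):
    """Integrated score: predicted probability that the edge is a true interaction."""
    x = edge_features(edge)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


edges = sorted(TRAINING_EDGES)
weights, bias = train_logistic(
    [edge_features(e) for e in edges],
    [TRAINING_EDGES[e] for e in edges],
)
integrated = {e: score_edge(e, weights, bias) for e in edges}

for edge in edges:
    print(edge, round(integrated[edge], 3))
```

In this toy setting, edges supported by several evidence types receive higher integrated scores than edges supported by only one or two. A real analysis would score edges held out from training and compare the integrated ranking against each input network on its own.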