Large-scale dynamic gene regulatory network inference combining differential equation models with local dynamic Bayesian network analysis

Zheng Li, Ping Li, Arun Krishnan, Jingdong Liu
DOI: https://doi.org/10.1093/bioinformatics/btr454
IF: 5.8
2011-08-04
Bioinformatics
Abstract:
MOTIVATION: Reverse engineering gene regulatory networks, especially large networks, from time series gene expression data remains a challenge to the systems biology community. In this article, a new hybrid algorithm integrating ordinary differential equation models with dynamic Bayesian network analysis, called Differential Equation-based Local Dynamic Bayesian Network (DELDBN), was proposed and implemented for gene regulatory network inference.
RESULTS: The performance of DELDBN was benchmarked with an in vivo dataset from yeast. DELDBN significantly improved the accuracy and sensitivity of network inference compared with other approaches. The local causal discovery algorithm implemented in DELDBN also reduced the complexity of the network inference algorithm and improved its scalability to infer larger networks. We have demonstrated the applicability of the approach to a network containing thousands of genes with a dataset from human HeLa cell time series experiments. The local network around BRCA1 was investigated in particular and validated with independent published studies. The BRCA1 network was significantly enriched with known BRCA1-relevant interactions, indicating that DELDBN can effectively infer large gene regulatory networks from time series data.
AVAILABILITY: The R scripts are provided in File 3 in Supplementary Material.
CONTACT: zheng.li@monsanto.com; jingdong.liu@monsanto.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology
What problem does this paper attempt to address?
This paper addresses the challenge of reverse-engineering large-scale gene regulatory networks (GRNs) from time-series gene expression data in systems biology. It proposes a new algorithm, DELDBN (Differential Equation-based Local Dynamic Bayesian Network), that aims to improve both the accuracy and the scalability of network inference from time-series data. The main contributions of the paper are as follows:

1. **Improving accuracy**: By combining an ordinary differential equation (ODE) model with dynamic Bayesian network (DBN) analysis, DELDBN achieves significantly higher accuracy and sensitivity than other methods on the in vivo yeast dataset.
2. **Enhancing scalability**: By implementing a local causal discovery algorithm, DELDBN reduces the complexity of network inference and improves its ability to handle large networks. The paper demonstrates this on human HeLa cell time-series data containing thousands of genes.
3. **Verifying the BRCA1 network**: The local network around the BRCA1 gene is examined and validated against independently published studies. The inferred BRCA1 network is significantly enriched for known BRCA1-relevant interactions, indicating that DELDBN can effectively infer large-scale gene regulatory networks from time-series data.

### Specific problem-solving methods

- **Model building**:
  - Use an ordinary differential equation (ODE) model to represent the dynamics of gene expression:
    \[ \frac{dX_i(t)}{dt}=\sum_{j}\beta_{ij}X_j(t) \]
    where \(X_i(t)\) and \(X_j(t)\) are the expression levels of gene \(i\) and gene \(j\) at time \(t\), and \(\beta_{ij}\) is the influence of gene \(j\) on gene \(i\).
- **Local causal discovery**:
  - Determine the local neighborhood of each target variable by identifying its Markov blanket, which reduces the number of independence tests and improves computational efficiency.
- **Time-series data processing**:
  - For data with a short sampling interval (such as the yeast data, sampled every 10 minutes), use the transcription rate to infer gene regulatory relationships:
    \[ \frac{X_i(t + 1)-X_i(t)}{\Delta t}=\sum_{j}\beta_{ij}X_j(t) \]
  - For data with a long sampling interval (such as data sampled once an hour), replace the ODE model with an autoregressive model:
    \[ X_i(t + 1)=\sum_{j}\beta_{ij}X_j(t) \]

### Experimental results

- **Yeast IRMA network**:
  - On the yeast IRMA network, DELDBN achieves a positive predictive value (PPV) of 0.7 and a sensitivity (SE) of 0.875, outperforming the other methods compared.
- **Human HeLa cell cycle data**:
  - DELDBN is applied to HeLa cell cycle time-series data containing thousands of genes, and the local network around the BRCA1 gene is examined to assess the method's effectiveness.

### Conclusion

DELDBN performs well on large-scale gene regulatory network inference from time-series data. It improves accuracy and scalability together, providing a new tool for inferring large-scale gene regulatory networks.
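
The discretized rate model described under time-series data processing can be illustrated with a short Python sketch. It is not the authors' R implementation, and it leaves out DELDBN's local causal discovery (Markov blanket) step entirely: it only estimates \(\beta_{ij}\) by ordinary least squares on noise-free toy data. The three-gene cascade, the function names, the step size and the 0.1 edge threshold are all assumptions made for illustration.

```python
import numpy as np


def simulate_linear_ode(beta, x0, dt, n_steps):
    """Forward-Euler simulation of dX/dt = beta @ X (toy data, not real expression)."""
    X = np.empty((n_steps + 1, len(x0)))
    X[0] = x0
    for t in range(n_steps):
        X[t + 1] = X[t] + dt * (beta @ X[t])
    return X


def fit_rate_model(X, dt):
    """Least-squares estimate of beta in (X[t+1] - X[t]) / dt = beta @ X[t]."""
    rates = (X[1:] - X[:-1]) / dt   # finite-difference transcription rates
    predictors = X[:-1]             # expression levels at time t
    # Solve predictors @ beta.T ~= rates for all target genes at once.
    coef, *_ = np.linalg.lstsq(predictors, rates, rcond=None)
    return coef.T                   # beta[i, j]: influence of gene j on gene i


def edge_metrics(beta_hat, beta_true, threshold):
    """PPV and sensitivity of thresholded off-diagonal edges."""
    off_diag = ~np.eye(beta_true.shape[0], dtype=bool)
    pred = (np.abs(beta_hat) > threshold) & off_diag
    true = (beta_true != 0) & off_diag
    tp = np.sum(pred & true)
    ppv = tp / pred.sum() if pred.any() else 0.0
    se = tp / true.sum() if true.any() else 0.0
    return ppv, se


# Hypothetical 3-gene cascade (gene 0 -> gene 1 -> gene 2) with self-decay.
beta_true = np.array([
    [-1.0,  0.0,  0.0],
    [ 0.8, -0.5,  0.0],
    [ 0.0,  0.5, -0.8],
])
dt = 0.5
X = simulate_linear_ode(beta_true, np.array([1.0, 0.5, 0.2]), dt, n_steps=20)
beta_hat = fit_rate_model(X, dt)
ppv, se = edge_metrics(beta_hat, beta_true, threshold=0.1)
print(np.round(beta_hat, 3))
print("PPV:", ppv, "SE:", se)
```

Regressing `X[1:]` directly on `X[:-1]`, instead of the finite-difference rates, gives the autoregressive form used for sparsely sampled data. With real, noisy expression data and thousands of genes, an unrestricted regression like this is underdetermined, which is the situation the local neighborhood restriction is meant to address.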