Dynamic Bayesian Network (DBN) with Structure Expectation Maximization (SEM) for Modeling of Gene Network from Time Series Gene Expression Data
Yu Zhang, Zhidong Deng, Hongshan Jiang, Peifa Jia
2006-01-01
Abstract: Exploring gene regulatory networks is a key topic in molecular biology. In this paper, we present a new dynamic Bayesian network (DBN) framework embedded with structural expectation maximization (SEM) to model gene relationships. It is well suited to analyzing time-series data and can handle cyclic structures that a static Bayesian network cannot represent. We applied the new method to learning the regulatory network and the metabolic pathway from Saccharomyces cerevisiae cell cycle gene expression data. The results show that the proposed method is capable of handling missing values in expression data sets, and that the inference accuracy can be further improved.

Keywords: Microarrays; Gene regulatory networks; Dynamic Bayesian network; Structural expectation maximization

1. Introduction

Establishing gene regulatory networks is critical to understanding the genetic regulation process, and this problem has become an important challenge in recent years. The invention of microarray technology is viewed as a milestone, because it lets scientists measure the expression levels of thousands of genes simultaneously. Several methods have been presented so far to learn gene networks from microarray data, such as Boolean networks [2, 3], differential equations [4, 5], and Bayesian networks [6-8]. Among them, the Bayesian network based approach has received a lot of attention because of the probabilistic nature of the model. It can be used to learn causal relationships and, in particular, can readily be combined with prior knowledge.

However, the static version of the Bayesian network has notable shortcomings. First, it is unable to capture temporal information. Second, it cannot model cyclic networks, although cyclic regulation is often considered an accurate description of real gene regulation mechanisms. In this paper, we employ a dynamic Bayesian network (DBN) [1, 9, 10], instead of its earlier static version, to model a gene network with cyclic regulation.
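To illustrate why a DBN can represent cyclic regulation while a static Bayesian network cannot, the following minimal sketch unrolls a feedback loop over time slices and checks both graphs for directed cycles. The two-gene loop between `A` and `B`, the first-order mapping of each edge to the next time slice, and the helper names `has_cycle` and `unroll` are assumptions made for illustration only; they are not taken from the method proposed in this paper.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph given as (parent, child) pairs has a cycle."""
    graph = defaultdict(list)
    nodes = set()
    for parent, child in edges:
        graph[parent].append(child)
        nodes.update((parent, child))

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {node: WHITE for node in nodes}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: a directed cycle exists
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in nodes)


def unroll(edges, n_slices):
    """Map each regulatory edge X -> Y to (X, t) -> (Y, t + 1) for consecutive slices."""
    return [((p, t), (c, t + 1)) for t in range(n_slices - 1) for p, c in edges]


# Hypothetical feedback loop: gene A regulates gene B, and B regulates A.
regulation = [("A", "B"), ("B", "A")]

static_cyclic = has_cycle(regulation)   # the static graph contains a directed cycle
unrolled = unroll(regulation, 3)        # the same loop spread over three time slices
unrolled_cyclic = has_cycle(unrolled)   # every edge points forward in time

print("static graph cyclic:", static_cyclic)
print("unrolled graph cyclic:", unrolled_cyclic)
```

Because every unrolled edge points from slice t to slice t + 1, the unrolled graph is acyclic by construction, so it remains a valid Bayesian network structure even when the underlying regulation contains feedback.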
In general, the DBN is well suited to characterizing time-series gene expression data. Owing to the limitations of experimental conditions, gene expression data sets contain many missing values, which usually affect the inference accuracy. To address this problem, we propose a new DBN model embedded with