A Combined Model and a Varied Gibbs Sampling Algorithm Used for Motif Discovery.

Xiaoming Wu, Bo Wang, Changxin Song, Jingzhi Cheng
2004-01-01
Abstract: Conserved sequences in gene regulatory regions play a dominant role in gene regulation. Discovering these sequences and their functions is important in the post-genome era. A novel model is constructed to represent conserved motifs in DNA sequences. This model combines the PWM and WAM models. Its advantage is that it captures not only the individual base frequencies within a motif but also the relationships between neighbouring bases. In addition, a variant of the Gibbs sampling algorithm is applied that allows the number of motif occurrences to differ from sequence to sequence. This variation is more consistent with the actual mechanism of gene transcription control. A program is constructed by combining the model with the discovery algorithm. A set of DNA sequences from the upstream regions of genes is analysed with this program, and the putative motifs discovered are compared with experimentally verified regulatory sequences. The results show that this combination is well suited to motif discovery and that the approach is useful for gene regulation research.
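The abstract does not say how the PWM and WAM components are combined, so the following is only a minimal sketch under stated assumptions. It builds per-position base frequencies (the PWM part) and adjacent-base conditional frequencies (a WAM-style part) from aligned sites, then mixes their log-odds scores. The function names, the `weight` mixing parameter, the pseudocount, and the uniform background are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

BASES = "ACGT"

def build_pwm(sites, pseudo=1.0):
    """Per-position base frequencies from aligned, equal-length sites."""
    width = len(sites[0])
    total = len(sites) + 4 * pseudo
    pwm = []
    for i in range(width):
        counts = Counter(s[i] for s in sites)
        pwm.append({b: (counts[b] + pseudo) / total for b in BASES})
    return pwm

def build_wam(sites, pseudo=1.0):
    """WAM-style tables: P(base at i | base at i-1) for positions i >= 1."""
    width = len(sites[0])
    wam = []
    for i in range(1, width):
        pairs = Counter((s[i - 1], s[i]) for s in sites)
        table = {}
        for prev in BASES:
            total = sum(pairs[(prev, b)] for b in BASES) + 4 * pseudo
            for b in BASES:
                table[(prev, b)] = (pairs[(prev, b)] + pseudo) / total
        wam.append(table)
    return wam

def combined_log_score(window, pwm, wam, weight=0.5, background=0.25):
    """Assumed combination: weighted average of PWM and WAM log-odds."""
    pwm_score = sum(
        math.log(pwm[i][b] / background) for i, b in enumerate(window)
    )
    # The first position has no left neighbour, so it uses the PWM column.
    wam_score = math.log(pwm[0][window[0]] / background)
    wam_score += sum(
        math.log(wam[i - 1][(window[i - 1], window[i])] / background)
        for i in range(1, len(window))
    )
    return weight * pwm_score + (1.0 - weight) * wam_score

# Toy aligned sites, invented for illustration only.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(sites)
wam = build_wam(sites)
motif_like = combined_log_score("TATAAT", pwm, wam)
unrelated = combined_log_score("GGCCGG", pwm, wam)
```

With these toy sites, a window that resembles the aligned sites scores higher than an unrelated one. In a Gibbs sampler, a score of this kind would typically be used to weight candidate motif positions when resampling, but the paper's actual sampling scheme is not described in the abstract.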