Abstract:

BACKGROUND: In metazoans, the regulation of gene expression is complex and occurs at many levels, including the transcriptional and post-transcriptional levels. Transcriptional regulation is determined mainly by sequence elements within the promoter regions of genes, while sequence elements within the 3' untranslated regions (3' UTRs) of mRNAs play important roles in post-transcriptional regulation, such as mRNA stability and translation efficiency. Identifying cis-regulatory elements, or motifs, is more difficult in multicellular eukaryotes than in unicellular eukaryotes because of the larger intergenic sequence space and the greater complexity of regulation. Experimental techniques for discovering functional elements are often time-consuming and not easily applied at the genome scale. Consequently, computational methods are advantageous for genome-wide cis-regulatory motif detection. To reduce the search space in metazoans, many algorithms use cross-species alignment, although studies have demonstrated that a large portion of the binding sites for the same trans-acting factor do not reside in alignable regions. Therefore, a computational algorithm should account for both conserved and nonconserved cis-regulatory elements in metazoans.

RESULTS: We present CompMoby (Comparative MobyDick), software developed to identify cis-regulatory binding sites at both the transcriptional and post-transcriptional levels in metazoans without prior knowledge of the trans-acting factors. The CompMoby algorithm was previously shown to identify cis-regulatory binding sites in upstream regions of genes co-regulated in embryonic stem cells. In this paper, we extend the software to identify putative cis-regulatory motifs in 3' UTR sequences and verify our results using experimentally validated data sets in mouse and human. We also detail the implementation of CompMoby as a user-friendly tool that includes a web interface for streamlined analysis. Our software detects motifs in three categories: (1) those that are alignable and conserved; (2) those that are conserved but not alignable; and (3) those that are species-specific. One of CompMoby's output files lets users decide which category of cis-regulatory element to pursue experimentally, based on their biological problem. Using experimentally validated biological data sets, we demonstrate that CompMoby successfully detects cis-regulatory target sites of known and novel trans-acting factors at the transcriptional and post-transcriptional levels.

CONCLUSION: CompMoby is a powerful software tool for systematic de novo discovery of evolutionarily conserved and nonconserved cis-regulatory sequences involved in transcriptional or post-transcriptional regulation in metazoans. This software is freely available to users at http://genome.ucsf.edu/compmoby/.
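
The three motif categories named in the Results can be made concrete with a small sketch. The Python below is a toy illustration only, not CompMoby's algorithm or output format: it assumes exact string matching, two species, and a precomputed alignment given as a position map. The names `find_sites`, `classify_motif`, and `a_to_b` are hypothetical and introduced here for illustration.

```python
from typing import Dict, List, Optional


def find_sites(motif: str, seq: str) -> List[int]:
    """Return 0-based start positions of every exact match, overlaps included."""
    sites = []
    start = seq.find(motif)
    while start != -1:
        sites.append(start)
        start = seq.find(motif, start + 1)
    return sites


def classify_motif(
    motif: str,
    seq_a: str,
    seq_b: str,
    a_to_b: Dict[int, int],
) -> Optional[str]:
    """Assign a motif to one of the three categories (toy version).

    a_to_b is an assumed input: it maps a position in species A to the
    aligned position in species B. Positions missing from the map are
    treated as unalignable.
    """
    sites_a = find_sites(motif, seq_a)
    sites_b = set(find_sites(motif, seq_b))

    if not sites_a and not sites_b:
        return None  # motif occurs in neither sequence
    if not sites_a or not sites_b:
        return "species-specific"
    # Conserved in both species; check whether any pair of sites is aligned.
    if any(a_to_b.get(pos) in sites_b for pos in sites_a):
        return "alignable and conserved"
    return "conserved but not alignable"


# Hypothetical sequences; the identity map stands in for a real alignment.
mouse = "TTGACGTCAAGG"
identity = {i: i for i in range(len(mouse))}

same_place = classify_motif("GACGTCA", mouse, "TTGACGTCATGG", identity)
moved = classify_motif("GACGTCA", mouse, "GACGTCATTTGG", identity)
lost = classify_motif("GACGTCA", mouse, "TTGACCTCAAGG", identity)
```

A real analysis would need degenerate motifs, both strands, and more than two species; the sketch only shows how the alignment map separates the first two categories from each other, and how presence in a single species defines the third.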

Building a Dictionary for Genomes: Identification of Presumptive Regulatory Sites by Statistical Analysis

Regulatory Element Detection Using a Probabilistic Segmentation Model

CompMoby: Comparative MobyDick for Detection of Cis-Regulatory Motifs

Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs

A Suite of Web-Based Programs to Search for Transcriptional Regulatory Motifs

Identification of the Binding Sites of Regulatory Proteins in Bacterial Genomes

A computational approach to regulatory element discovery in eukaryotes

Correlating overrepresented upstream motifs to gene expression: a computational approach to regulatory element discovery in eukaryotes

Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome

Identification of degenerate motifs using position restricted selection and hybrid ranking combination

MotifMark: Finding regulatory motifs in DNA sequences

Identification of Functional Elements and Regulatory Circuits by Drosophila Modencode

Inference of Combinatorial Regulation in Yeast Transcriptional Networks: A Case Study of Sporulation

Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

A Mutation Degree Model for the Identification of Transcriptional Regulatory Elements

A Bag-Of-Motif Model Captures Cell States at Distal Regulatory Sequences

Genome-Wide Discovery of Modulators of Transcriptional Interactions in Human B Lymphocytes

CompleteMOTIFs: DNA Motif Discovery Platform for Transcription Factor Binding Experiments

Systematic discovery of regulatory motifs in human promoters and 3′ UTRs by comparison of several mammals

Regulatory Element Detection Using Correlation with Expression

Discovery and information-theoretic characterization of transcription factor binding sites that act cooperatively