Abstract:

BACKGROUND: In metazoans, the regulation of gene expression is complex and occurs at many levels, including the transcriptional and post-transcriptional levels. Transcriptional regulation is mainly determined by sequence elements within the promoter regions of genes, while sequence elements within the 3' untranslated regions (3' UTRs) of mRNAs play important roles in post-transcriptional regulation, such as mRNA stability and translation efficiency. Identifying cis-regulatory elements, or motifs, is more difficult in multicellular eukaryotes than in unicellular eukaryotes because of the larger intergenic sequence space and the increased complexity of regulation. Experimental techniques for discovering functional elements are often time-consuming and not easily applied at the genome scale. Consequently, computational methods are advantageous for genome-wide cis-regulatory motif detection. To reduce the search space in metazoans, many algorithms use cross-species alignment, although studies have demonstrated that a large portion of the binding sites for the same trans-acting factor do not reside in alignable regions. Therefore, a computational algorithm should account for both conserved and nonconserved cis-regulatory elements in metazoans.

RESULTS: We present CompMoby (Comparative MobyDick), software developed to identify cis-regulatory binding sites at both the transcriptional and post-transcriptional levels in metazoans without prior knowledge of the trans-acting factors. The CompMoby algorithm was previously shown to identify cis-regulatory binding sites in upstream regions of genes co-regulated in embryonic stem cells. In this paper, we extend the software to identify putative cis-regulatory motifs in 3' UTR sequences and verify our results using experimentally validated data sets in mouse and human. We also detail the implementation of CompMoby as a user-friendly tool that includes a web interface for streamlined analysis. Our software detects motifs in three categories: (1) those that are alignable and conserved; (2) those that are conserved but not alignable; and (3) those that are species specific. One of the output files from CompMoby lets users decide which category of cis-regulatory element to pursue experimentally, based on their biological problem. Using experimentally validated biological datasets, we demonstrate that CompMoby successfully detects cis-regulatory target sites of known and novel trans-acting factors at the transcriptional and post-transcriptional levels.

CONCLUSION: CompMoby is a powerful software tool for systematic de novo discovery of evolutionarily conserved and nonconserved cis-regulatory sequences involved in transcriptional or post-transcriptional regulation in metazoans. This software is freely available to users at http://genome.ucsf.edu/compmoby/.
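The three motif categories named in the abstract can be illustrated with a small sketch. This is not CompMoby's algorithm or output format; it is a simplified illustration that assumes a pairwise alignment of orthologous sequences from two species is already available, with gaps written as "-". The names `MotifCategory`, `motif_columns`, and `classify_motif` are hypothetical, and a motif is treated as "aligned" simply when an exact occurrence starts at the same alignment column in both species.

```python
from enum import Enum


class MotifCategory(Enum):
    """Hypothetical labels mirroring the three categories in the abstract."""
    ALIGNED_CONSERVED = "alignable and conserved"
    UNALIGNED_CONSERVED = "conserved but not alignable"
    SPECIES_SPECIFIC = "species specific"
    ABSENT = "absent from the reference species"


def motif_columns(aligned_seq, motif):
    """Return the alignment columns at which exact motif occurrences start.

    Gaps ('-') are ignored when matching, but reported positions are
    columns of the gapped alignment, so they are comparable across species.
    """
    # Alignment column of every non-gap character, in order.
    columns = [i for i, ch in enumerate(aligned_seq) if ch != "-"]
    ungapped = "".join(aligned_seq[i] for i in columns).upper()
    motif = motif.upper()

    starts = set()
    pos = ungapped.find(motif)
    while pos != -1:
        starts.add(columns[pos])
        pos = ungapped.find(motif, pos + 1)  # allow overlapping occurrences
    return starts


def classify_motif(motif, aligned_ref, aligned_other):
    """Classify a motif found in the reference species (simplified rule)."""
    ref_cols = motif_columns(aligned_ref, motif)
    other_cols = motif_columns(aligned_other, motif)

    if not ref_cols:
        return MotifCategory.ABSENT
    if ref_cols & other_cols:
        # Occurrences start at the same alignment column in both species.
        return MotifCategory.ALIGNED_CONSERVED
    if other_cols:
        # Present in both species, but never at a shared column.
        return MotifCategory.UNALIGNED_CONSERVED
    return MotifCategory.SPECIES_SPECIFIC


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    mouse = "ACGTTGACGT--AAAC"
    human = "ACGTTGACGTCCAAAC"
    print(classify_motif("TGACGT", mouse, human).value)
```

A real analysis would work with degenerate motifs and statistical over-representation rather than exact string matches; the sketch only shows how the three categories differ once motif positions and an alignment are known.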

Identification of degenerate motifs using position restricted selection and hybrid ranking combination.

Searching Maximal Degenerate Motifs Guided by a Compact Suffix Tree.

Freduce: Detection of Degenerate Regulatory Elements Using Correlation with Expression.

A Mutation Degree Model for the Identification of Transcriptional Regulatory Elements

MotifMark: Finding regulatory motifs in DNA sequences

An Integrative and Applicable Phylogenetic Footprinting Framework for Cis-Regulatory Motifs Identification in Prokaryotic Genomes

A Suite of Web-Based Programs to Search for Transcriptional Regulatory Motifs

Motto: Representing Motifs in Consensus Sequences with Minimum Information Loss

Eukaryotic Regulatory Element Conservation Analysis and Identification Using Comparative Genomics

C-Reduce: Incorporating Sequence Conservation to Detect Motifs That Correlate with Expression

Cross-platform DNA motif discovery and benchmarking to explore binding specificities of poorly studied human transcription factors

CompMoby: Comparative MobyDick for Detection of Cis-Regulatory Motifs

Using ChIPMotifs for De Novo Motif Discovery of OCT4 and ZNF263 Based on ChIP-based High-Throughput Experiments.

Systematic Identification of Conserved Motif Modules in the Human Genome

Using Weeder, Pscan, and PscanChIP for the Discovery of Enriched Transcription Factor Binding Site Motifs in Nucleotide Sequences

CompleteMOTIFs: DNA Motif Discovery Platform for Transcription Factor Binding Experiments

Evolutionary optimization of transcription factor binding motif detection

MotifHub: Detection of trans-acting DNA motif group with probabilistic modeling algorithm

Regulatory Element Detection Using a Probabilistic Segmentation Model

Finding the transcription factor binding locations using novel algorithm segmentation to filtration (S2F)

Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs