Abstract: Extracting meaningful patterns from voluminous amounts of biological data is a major challenge. Motifs are biological patterns of great interest to biologists. Researchers have identified many different versions of the motif finding problem, including the Planted $(l, d)$ Motif version and versions based on position-specific score matrices. A comparative study of the various motif search algorithms is important for several reasons. For example, it could identify the strengths and weaknesses of each algorithm, which in turn might allow us to devise hybrids that perform better than the individual components. In this paper we compare, either directly or indirectly, the quality of motifs predicted by PMSprune (an algorithm based on the $(l, d)$ motif model) and 14 other algorithms in terms of seven measures, using well-established benchmark datasets including the one provided by Tompa et al. These comparisons show that the performance of PMSprune is competitive with that of the other 14 algorithms tested. On the Tompa et al. benchmark, both PMSprune and DME (an algorithm based on position-specific score matrices) in general perform better than the 13 algorithms reported in Tompa et al. Subsequently we compared PMSprune and DME on other benchmark datasets, including ChIP-Chip, ChIP-seq, and ABS. Between the two, PMSprune performs better than DME on six measures, while DME performs better than PMSprune on one measure (namely, specificity).
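For readers unfamiliar with the Planted $(l, d)$ Motif model mentioned in the abstract, the following is a minimal brute-force sketch of the problem statement as it is commonly defined: find every string of length $l$ that occurs in each input sequence with at most $d$ mismatches (Hamming distance). It is not PMSprune, which prunes the search space rather than enumerating it, and the function names (`hamming`, `occurs_within`, `planted_motifs`) and the DNA alphabet default are assumptions made for illustration only.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate: str, seq: str, d: int) -> bool:
    """True if some l-mer of seq is within Hamming distance d of candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def planted_motifs(seqs, l: int, d: int, alphabet: str = "ACGT"):
    """Enumerate all |alphabet|**l candidates (exponential in l) and keep
    those that appear, with at most d mismatches, in every sequence."""
    motifs = []
    for chars in product(alphabet, repeat=l):
        candidate = "".join(chars)
        if all(occurs_within(candidate, s, d) for s in seqs):
            motifs.append(candidate)
    return motifs
```

For example, `planted_motifs(["AACGT", "ACGTT", "GACGA"], 3, 0)` returns `['ACG']`, the only 3-mer shared exactly by all three toy sequences. The exhaustive enumeration is only practical for very small $l$, which is why exact algorithms for this model rely on pruning.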

Reference Sequence Selection for Motif Searches

RefSelect: A Reference Sequence Selection Algorithm for Planted (l, d) Motif Search

An Efficient Exact Algorithm for Planted Motif Search on Large DNA Sequence Datasets

PairMotif: A New Pattern-Driven Algorithm for Planted (l, d) DNA Motif Search

Efficient Sequential and Parallel Algorithms for Planted Motif Search

A Fast Exact Pattern Matching Algorithm for Biological Sequences

qPMS Sigma -- An Efficient and Exact Parallel Algorithm for the Planted $(l, d)$ Motif Search Problem

An Experimental Comparison of PMSPrune and Other Algorithms for Motif Search

An optimal combined search strategy for motif detection

Identification of degenerate motifs using position restricted selection and hybrid ranking combination.

Finding the transcription factor binding locations using novel algorithm segmentation to filtration (S2F)

Exploring Scalable Parallelization for Edit Distance-Based Motif Search

BioPM: An Efficient Algorithm for Protein Motif Mining

Quick-Motif: An Efficient and Scalable Framework for Exact Motif Discovery

An Optimized Method for Protein Motif Mining

Mining Algorithm for Protein Sequence Pattern

Detecting motifs in DNA sequences by branching from neighbors of qualified potential motifs

Novel algorithms for LDD motif search

PairMotifChIP: A Fast Algorithm for Discovery of Patterns Conserved in Large ChIP-seq Data Sets

An effective algorithm of motif finding problem.

An improved voting algorithm for planted (l, d) motif search