Abstract

Background: DNA signatures are distinct short nucleotide sequences that provide valuable information for purposes such as the design of Polymerase Chain Reaction primers and microarray experiments. Biologists usually use a discovery algorithm to find unique signatures in DNA databases, and then apply the signatures to microarray experiments. Such discovery algorithms require input factors to be set, such as signature length l and mismatch tolerance d, and these factors affect the discovery results. However, guidance on how to select proper factor values is rare, especially when an unfamiliar DNA database is used. In most cases, biologists select factor values based on experience, or even by guessing. If the discovered result is unsatisfactory, they change the input factors of the algorithm to obtain a new result, and repeat this process until a proper result is obtained. Implicit signatures under the discovery condition (l, d) are defined as the signatures of length ≤ l with mismatch tolerance ≥ d. A discovery algorithm that could discover all implicit signatures, so that those meeting the requirements can then be selected from the results, would be more helpful than one that depends on trial and error. However, existing discovery algorithms do not address the need to discover all implicit signatures.

Results: This work proposes two discovery algorithms: the consecutive multiple discovery (CMD) algorithm and the parallel and incremental signature discovery (PISD) algorithm. The PISD algorithm is designed to discover signatures efficiently under a given discovery condition. It finds new results by using previously discovered results as candidates, rather than by using the whole database, and it further increases discovery efficiency by applying parallel computing. The CMD algorithm is designed to discover implicit signatures efficiently. It uses the PISD algorithm as a kernel routine to discover signatures under every feasible discovery condition.

Conclusions: The proposed algorithms discover implicit signatures efficiently. In experiments using eight processing cores, the presented CMD algorithm took up to 97% less execution time than typical sequential discovery algorithms when discovering implicit signatures.
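The definitions above can be made concrete with a small brute-force sketch. It is not the CMD or PISD algorithm from the paper; it only illustrates the discovery condition (l, d) and the idea of reusing earlier results as candidates. It assumes that a window of length l is a signature under tolerance d when every other window of the same length in the database differs from it in more than d positions (Hamming distance). All names (`discover`, `implicit_signatures`, `l_min`, `d_max`) are hypothetical, and the parallel part is omitted.

```python
from typing import Dict, List, Optional, Set, Tuple

Pos = Tuple[int, int]  # (sequence index, offset within that sequence)


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def all_windows(db: List[str], l: int) -> List[Pos]:
    """Every position at which a window of length l starts in the database."""
    return [(i, j) for i, seq in enumerate(db) for j in range(len(seq) - l + 1)]


def is_signature(db: List[str], pos: Pos, l: int, d: int) -> bool:
    """Assumed definition: no other window lies within Hamming distance d."""
    i, j = pos
    pattern = db[i][j:j + l]
    for i2, j2 in all_windows(db, l):
        if (i2, j2) != pos and hamming(pattern, db[i2][j2:j2 + l]) <= d:
            return False
    return True


def discover(db: List[str], l: int, d: int,
             candidates: Optional[Set[Pos]] = None) -> Set[Pos]:
    """Signatures under (l, d), checking only the given candidates if any."""
    if candidates is None:
        candidates = set(all_windows(db, l))
    return {p for p in candidates if is_signature(db, p, l, d)}


def implicit_signatures(db: List[str], l: int, d: int, l_min: int = 1,
                        d_max: Optional[int] = None) -> Dict[Tuple[int, int], Set[Pos]]:
    """Signatures for every length <= l and tolerance >= d (naive sweep).

    A signature under tolerance d + 1 is also one under tolerance d, so the
    result for d serves as the candidate set for d + 1 at the same length.
    """
    results: Dict[Tuple[int, int], Set[Pos]] = {}
    for length in range(l_min, l + 1):
        top = length if d_max is None else d_max
        previous: Optional[Set[Pos]] = None
        for tol in range(d, top + 1):
            previous = discover(db, length, tol, previous)
            results[(length, tol)] = previous
    return results
```

For the single-sequence database `["ACGT"]` with l = 2, the windows AC, CG and GT pairwise differ in both positions, so all three are signatures for d = 0 and d = 1 and none is for d = 2. Reusing the previous result as the candidate set only shrinks the work in the outer check; each candidate is still compared against the whole database.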

An Incremental Algorithm for Efficient Unique Signature Discoveries on DNA Databases.

A Parallel and Incremental Algorithm for Efficient Unique Signature Discovery on DNA Databases

An Efficient Algorithm for Unique Signature Discovery on Whole-Genome EST Databases

A Fuzzy sequencer for rapid DNA fragment counting and genotyping

Improving the Single Template Method in DNA Computing

O(1.414^n) volume molecular solutions for the subset-sum problem on DNA-based supercomputing

Discovering simple DNA sequences by the algorithmic significance method

Dynamic Addressing Molecular Robot (DAMR): An Effective and Efficient Trial‐and‐Error Approach for the Analysis of Single Nucleotide Polymorphisms

Transverse Electronic Signature of DNA for Electronic Sequencing

A novel Algorithm for Solving the Prize Collecting Traveling Salesman Problem based on DNA Computing

DNA Hash Pooling and its Applications

High efficiency error suppression for accurate detection of low-frequency variants

SUA-Based Algorithm for Finding SATRs in DNA Sequence

Yield Improving Strategy for SNP Detection on DNA Origami Chip by Employing Au Nanoparticles

Reconstruction algorithms for DNA-storage systems

Explorer: efficient DNA coding by De Bruijn graph toward arbitrary local and global biochemical constraints

An Efficient DNA‐Fueled Molecular Machine for the Discrimination of Single‐Base Changes

StarSignDNA: Signature tracing for accurate representation of mutational processes

An advanced approach for DNA sequencing and similarities analysis on the basis of groupings of nucleotide bases

Bacterial Foraging Algorithm Based on Activity of Bacteria for DNA Computing Sequence Design

OPUSeq Simplifies Detection of Low-Frequency DNA Variants and Uncovers Fragmentase-Associated Artifacts