An Incremental Algorithm for Efficient Unique Signature Discoveries on DNA Databases.

Hsiao Ping Lee, Tzu-Fang Sheu, Chuan Yi Tang
DOI: https://doi.org/10.1145/1774088.1774413
2010-01-01
Abstract: DNA signatures are distinct short nucleotide sequences that can be used to detect the presence of certain organisms and to distinguish those organisms from all other species. The signatures provide valuable information for many applications, such as PCR primer design and microarray experiments. In practice, we use a discovery algorithm to find unique signatures in DNA databases and then apply the signatures to microarray experiments. If the discovered result is not satisfactory, we change the parameter settings of the algorithm to obtain a new result. This process of changing parameter settings may be repeated until a satisfactory result is obtained; we call this consecutively multiple discoveries. The situation occurs frequently, especially when handling unfamiliar DNA databases. The challenge is how to accomplish each new discovery efficiently. Existing discovery algorithms do not consider the needs of consecutively multiple discoveries. In this paper, we propose an incremental algorithm designed specifically for consecutively multiple discoveries. The algorithm is based on observations about the properties of the signatures. It finds the new result by using the previously discovered results as candidates rather than performing a complete discovery on the whole database. Because the candidates in the incremental discovery are limited to the previously discovered signatures, the discovery process is accelerated. Compared with typical discovery algorithms that perform a complete discovery on the whole database, our incremental algorithm saves up to 87% of the execution time in our experiments.
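The candidate-reuse idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not state the signature definition or the parameters involved, so the following are assumptions made only for illustration: a unique signature is a length-l substring whose Hamming distance to every other length-l substring in the database is greater than a mismatch tolerance d, and the parameter change is an increase in d. Under that assumption, every signature for the larger tolerance is also a signature for the smaller one, so only the previous results need to be re-checked. The names `full_discovery` and `incremental_discovery` are hypothetical.

```python
from typing import List, Tuple

# A candidate is (sequence index, offset, pattern). This layout is hypothetical.
Candidate = Tuple[int, int, str]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def substrings(db: List[str], l: int) -> List[Candidate]:
    """Every length-l substring in the database, with its location."""
    return [(s, i, seq[i:i + l])
            for s, seq in enumerate(db)
            for i in range(len(seq) - l + 1)]


def is_unique(cand: Candidate, pool: List[Candidate], d: int) -> bool:
    """Assumed definition: more than d mismatches to every other substring."""
    s, i, pat = cand
    return all(hamming(pat, other) > d
               for t, j, other in pool if (t, j) != (s, i))


def full_discovery(db: List[str], l: int, d: int) -> List[Candidate]:
    """Complete discovery: every substring in the database is a candidate."""
    pool = substrings(db, l)
    return [c for c in pool if is_unique(c, pool, d)]


def incremental_discovery(db: List[str], l: int, d_new: int,
                          previous: List[Candidate]) -> List[Candidate]:
    """Re-check only the previous results; valid here only if d_new >= old d."""
    pool = substrings(db, l)
    return [c for c in previous if is_unique(c, pool, d_new)]


if __name__ == "__main__":
    db = ["ACGTTGCA", "ACGTAGGA", "TTTTCCCC"]  # toy data, not from the paper
    prev = full_discovery(db, l=4, d=0)
    print(incremental_discovery(db, l=4, d_new=1, previous=prev))
```

In this sketch the incremental run still compares each candidate against the whole database, but the number of candidates shrinks from all substrings to the previously discovered signatures, which is the source of the speed-up the abstract describes. The brute-force comparison is chosen for clarity, not efficiency.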