DNA Sequence Motif Discovery Based on Kd-Trees and Genetic Algorithm.

Qiang Zhang, Shouhang Wu, Changjun Zhou, Xuedong Zheng
DOI: https://doi.org/10.1007/978-3-642-37502-6_98
2013-01-01
Abstract: In the post-genomics era, recognizing transcription factor binding sites (DNA motifs) in order to understand gene regulation is one of the major challenges. This paper proposes an improved algorithm for motif discovery in DNA sequences based on Kd-Trees and a Genetic Algorithm (KTGA). First, Kd-Trees are used to stratify the input DNA sequences, and the subsequences with the highest Hamming distance score in each layer are selected to form the initial population. A genetic algorithm is then used to find the true DNA sequence motif. Experiments on synthetic and biological data show that the algorithm can be applied to cases where each sequence contains one motif or multiple motifs, and that it improves the performance of the genetic algorithm at finding DNA motifs.
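The abstract does not spell out how the Hamming distance score is computed, so the following is only a minimal sketch of one common way such a score is defined in motif discovery: a candidate motif is compared against every window of the same width in each sequence, and the best (smallest) distances are summed. The function names `hamming`, `best_window`, and `total_distance`, and the toy sequences, are hypothetical and are not taken from the paper; the Kd-Tree stratification and the genetic operators are not shown.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def best_window(candidate: str, sequence: str) -> tuple[int, int]:
    """Return (distance, start) of the window in `sequence` closest to `candidate`.

    Assumes `sequence` is at least as long as `candidate`.
    """
    width = len(candidate)
    return min(
        (hamming(candidate, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


def total_distance(candidate: str, sequences: list[str]) -> int:
    """Sum of best-window distances over all sequences; lower is better."""
    return sum(best_window(candidate, s)[0] for s in sequences)


# Toy data (hypothetical): two exact occurrences and one with a single mismatch.
sequences = ["ACGTTGACA", "TTGACAGGC", "CCTTGTCAA"]
print(total_distance("TTGACA", sequences))  # prints 1
```

A genetic algorithm could use a quantity like `total_distance` (or a transformation of it) as a fitness value when ranking candidate motifs, but whether KTGA does so in exactly this form is an assumption here.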