CORAL-M: Heuristic Coding Region Alignment Method for Multiple Genome Sequences

Chelun Hung, Chunyuan Lin, Shihcheng Chang, Yehching Chung, Shu Ju Hsieh, Chuan Yi Tang, Yawling Lin
DOI: https://doi.org/10.1109/BIBMW.2010.5703803
2010-01-01
Abstract: Multiple sequence alignment (MSA) is a scientific tool that assists the study of DNA homology, phylogeny determination, and conserved motif identification. Various heuristic MSA methods have been presented to obtain an alignment of multiple sequences. Although these alignment tools can align protein, DNA, and RNA sequences successfully, they are less successful in aligning coding region sequences, because the resulting alignments may not be consistent with practical observations. Therefore, we propose CORAL-M, a heuristic alignment method for multiple genome sequences that is designed specifically for coding regions. CORAL-M adopts a probabilistic filtration model and a locally optimal solution to align genome sequences (codon to codon, with the wobble mask rule) using sliding windows and thus obtains a near-optimal alignment in linear time. The experimental results show that CORAL-M can be used to find potential functional sites by aligning viral strains of Poliovirus 1-3, Enterovirus 71, and Coxsackievirus 16.
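The abstract names codon-to-codon comparison, a wobble mask rule, and sliding windows without giving their details. The following is only a minimal sketch of how those three ideas could fit together, not the CORAL-M algorithm itself. It assumes the wobble mask simply ignores the third base of each codon, that both sequences are already in frame and gap-free, and that a window is scored by its fraction of matching codons. The function names (`codons`, `wobble_match`, `window_scores`) and the example sequences are hypothetical, and the probabilistic filtration model is not covered.

```python
def codons(seq):
    """Split an in-frame nucleotide sequence into complete codons."""
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def wobble_match(codon_a, codon_b):
    """Assumed wobble mask: two codons match if their first two bases agree."""
    return codon_a[:2] == codon_b[:2]


def window_scores(seq_a, seq_b, window=3):
    """Fraction of wobble-masked codon matches in each sliding window."""
    pairs = list(zip(codons(seq_a), codons(seq_b)))
    scores = []
    for start in range(len(pairs) - window + 1):
        hits = sum(wobble_match(a, b) for a, b in pairs[start:start + window])
        scores.append(hits / window)
    return scores


# Illustrative sequences (made up): the fourth codon differs at its second base.
seq_a = "ATGGCTGAAACCTTG"  # ATG GCT GAA ACC TTG
seq_b = "ATGGCCGAGAGATTA"  # ATG GCC GAG AGA TTA
scores = window_scores(seq_a, seq_b, window=3)
print(scores)
```

Because each codon pair is visited a bounded number of times per window, a scan of this kind runs in time proportional to the sequence length for a fixed window size, which is in the spirit of the linear-time claim above.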