Constrained Pairwise and Center-Star Sequences Alignment Problems
Yong Zhang, Joseph Wun-Tat Chan, Francis Y. L. Chin, Hing-Fung Ting, Deshi Ye, Feng Zhang, Jianyu Shi
DOI: https://doi.org/10.1007/s10878-015-9914-6
2015-01-01
Journal of Combinatorial Optimization
Abstract: Sequence alignment is a fundamental problem in computational biology and is also important in theoretical computer science. In this paper, we consider the problem of aligning a set of sequences subject to a given constrained sequence. Given two sequences \(A=a_1a_2\ldots a_n\) and \(B=b_1b_2\ldots b_n\) with a given distance function and a constrained sequence \(C=c_1c_2\ldots c_k\), our goal is to find the optimal sequence alignment of A and B w.r.t. the constraint C. We investigate several variants of this problem. If \(C=c^k\), i.e., all characters in C are the same, the optimal constrained pairwise sequence alignment can be found in \(O(\min \{kn^2,(t-k)n^2\})\) time, where t is the minimum number of occurrences of character c in A and B. For the variant in which the alignment score between any two consecutive constrained characters in the final alignment is upper bounded by some value, called GB-CPSA, we give a dynamic programming algorithm with time complexity \(O(kn^4/\log n)\). For the constrained center-star sequence alignment (CCSA), we prove that it is NP-hard to achieve the optimal alignment even over the binary alphabet. Furthermore, we show a negative result for CCSA: there is no polynomial-time algorithm that approximates CCSA within any constant ratio.
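To make the constrained pairwise problem concrete, the following is a minimal sketch of the textbook three-dimensional dynamic program for general C, which runs in O(kn^2) time. It is not the paper's improved algorithm for \(C=c^k\), nor the GB-CPSA algorithm. The function name `cpsa_cost`, the unit mismatch and gap costs, and the requirement that each constrained character sit in a column where A and B both carry that character are assumptions made for illustration.

```python
from math import inf


def cpsa_cost(a, b, c, mismatch=1, gap=1):
    """Minimum alignment cost of a and b such that c appears, in order,
    in columns where both sequences carry the same character.

    Baseline O(k * n * m) dynamic program; costs are assumed, not from the paper.
    Returns inf when no alignment satisfies the constraint.
    """
    n, m, k = len(a), len(b), len(c)
    # D[l][i][j]: best cost of aligning a[:i] with b[:j] while embedding c[:l].
    D = [[[inf] * (m + 1) for _ in range(n + 1)] for _ in range(k + 1)]
    D[0][0][0] = 0
    for l in range(k + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                if i == 0 and j == 0:
                    continue  # base case already set (inf for l > 0)
                best = inf
                if i > 0:  # a[i-1] aligned with a gap
                    best = min(best, D[l][i - 1][j] + gap)
                if j > 0:  # b[j-1] aligned with a gap
                    best = min(best, D[l][i][j - 1] + gap)
                if i > 0 and j > 0:
                    sub = 0 if a[i - 1] == b[j - 1] else mismatch
                    # ordinary column, no constrained character consumed
                    best = min(best, D[l][i - 1][j - 1] + sub)
                    # column that realizes the l-th constrained character
                    if l > 0 and a[i - 1] == b[j - 1] == c[l - 1]:
                        best = min(best, D[l - 1][i - 1][j - 1] + sub)
                D[l][i][j] = best
    return D[k][n][m]
```

For example, under these assumed costs `cpsa_cost("aab", "baa", "")` is 2 (two mismatches), while `cpsa_cost("aab", "baa", "b")` is 4, because forcing the two `b` characters into one column leaves every other character aligned with a gap.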