A Fast Longest Common Subsequence Algorithm for Biosequences Alignment

Wei Liu, Lin Chen
DOI: https://doi.org/10.1007/978-0-387-77251-6_8
2008-01-01
Abstract: Searching for the longest common subsequence (LCS) of biosequences is one of the most important tasks in bioinformatics. A fast algorithm for the LCS problem, named FAST_LCS, is presented. The algorithm first seeks the successors of the initial identical character pairs according to a successor table, obtaining all the identical pairs and their levels. Then, by tracing back from the identical character pair at the largest level, the LCS is obtained. For two sequences X and Y with lengths n and m, the memory required by FAST_LCS is max{8*(n+1)+8*(m+1), L}, where L is the number of identical character pairs, and the time complexity of the parallel implementation is O(|LCS(X,Y)|), where |LCS(X,Y)| is the length of the LCS of X and Y. Experimental results on gene sequences from the TIGR database, obtained on the MPP parallel computer Shenteng 1800, show that the algorithm produces exactly correct results and is faster and more efficient than other LCS algorithms.
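The successor-table and level idea described in the abstract can be illustrated with a minimal sequential sketch in Python. This is not the authors' FAST_LCS implementation: the function names `successor_table` and `lcs_by_levels` are hypothetical, no pruning or parallelism is included, and the table layout is an assumption made for illustration. Positions are 1-based, and the virtual pair (0, 0) stands in for the start of both sequences.

```python
def successor_table(seq, alphabet):
    """table[c][i] = smallest 1-based position k > i with seq[k-1] == c, else None."""
    n = len(seq)
    table = {c: [None] * (n + 1) for c in alphabet}
    for c in alphabet:
        nxt = None
        # Scan right to left so nxt always holds the nearest occurrence of c.
        for i in range(n, 0, -1):
            if seq[i - 1] == c:
                nxt = i
            table[c][i - 1] = nxt
    return table


def lcs_by_levels(x, y):
    """Return one longest common subsequence of x and y (assumed sketch)."""
    alphabet = sorted(set(x) & set(y))
    tx = successor_table(x, alphabet)
    ty = successor_table(y, alphabet)

    def successors(i, j):
        # Nearest identical character pair after (i, j) for each character.
        for c in alphabet:
            a, b = tx[c][i], ty[c][j]
            if a is not None and b is not None:
                yield (a, b)

    # levels[k] maps every pair reachable at level k+1 to one predecessor.
    levels = []
    frontier = {(0, 0): None}
    while True:
        nxt = {}
        for (i, j) in frontier:
            for pair in successors(i, j):
                nxt.setdefault(pair, (i, j))
        if not nxt:
            break
        levels.append(nxt)
        frontier = nxt

    if not levels:
        return ""

    # Trace back from any pair at the largest level.
    pair = next(iter(levels[-1]))
    chars = []
    for k in range(len(levels) - 1, -1, -1):
        chars.append(x[pair[0] - 1])
        pair = levels[k][pair]
    return "".join(reversed(chars))
```

In this sketch the number of levels built equals the LCS length, which is the quantity the abstract's parallel time bound refers to. When several LCSs exist, the sketch returns just one of them.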