On the Alignment Space and Its Applications

Shi-Yi Shen,Kui Wang,Gang Hu,Shu-Tao Xia
DOI: https://doi.org/10.1109/itw2.2006.323780
2006-01-01
Abstract:The classic error-correcting codes are designed to correct substitution errors, in which the original symbol is replaced by a different symbol. The so called generalized errors include not only substitutions of symbols but also deletions and/or insertions of symbols. In the areas of computer science, information theory, cryptography and bioinformatics, sequences with generalized errors are discussed respectively for different aims. We describe generalized errors by mutation model which comes from bioinformatics. To deal with the sequences with generalized errors, we also need some type of metric to measure the distance of sequences with various lengths. The alignment distance is defined as the distance between the two sequences with generalized errors. The algorithms to calculate the alignment distance were studied in bioinformatics recently, such as Smith-Waterman dynamic programming algorithm and SPA algorithm. Under the alignment distance, we give the definition of the alignment space. To strictly describe the structure of the sequences with generalized errors, modular structure theory is introduced. A new and more strict proof of triangle inequality for alignment distance is thus given. As applications of the alignment space, the definition and construction of generalized error-correcting codes are studied in this paper. Some optimal codes with small length are listed. The error-correcting capability of random codes with large length is also presented. We discuss Fault-tolerance complex in cryptography as another application of the alignment space at the end of the paper.
What problem does this paper attempt to address?