Similarity Analysis of DNA Sequences Based on a Compact Representation

Zhujin Zhang, Shuo Wang, Xingyi Zhang, Zheng Zhang
DOI: https://doi.org/10.1109/bicta.2010.5645092
2010-01-01
Abstract: Randić et al. proposed a significant graphical representation for DNA sequences that is very compact and avoids loss of information. In this paper, we build a fast algorithm for this graphical representation with time complexity O(n^2), and identify another important advantage of the representation: no degeneracy. Moreover, we propose a new method for similarity analysis of DNA sequences based on the representation. The approach adopts four elements of a covariance matrix as a descriptor, and is illustrated on the first exon of the beta-globin genes of 11 different species.
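The descriptor idea mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: that each sequence has already been mapped to a list of 2D points (the mapping itself is not reproduced here), that the "four elements" are the entries of the resulting 2x2 covariance matrix, and that descriptors are compared by Euclidean distance. The function names `covariance_descriptor` and `descriptor_distance` are hypothetical.

```python
import math


def covariance_descriptor(points):
    """Return the four entries (cxx, cxy, cyx, cyy) of the 2x2 population
    covariance matrix of a list of (x, y) points.

    Assumption: the points come from some 2D graphical representation of a
    sequence; how they are produced is outside the scope of this sketch.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    cyy = sum((y - mean_y) ** 2 for _, y in points) / n
    cxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # The covariance matrix is symmetric, so cyx equals cxy.
    return (cxx, cxy, cxy, cyy)


def descriptor_distance(a, b):
    """Euclidean distance between two descriptors (an assumed similarity
    measure; smaller values mean more similar point sets)."""
    return math.dist(a, b)


# Toy point sets, not real sequence data.
d1 = covariance_descriptor([(0, 0), (2, 2)])
d2 = covariance_descriptor([(0, 0), (2, 0)])
print(descriptor_distance(d1, d2))
```

Under these assumptions, a smaller distance between two descriptors would indicate more similar sequences; the paper's actual distance measure and normalization may differ.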