Clustering Species by Extracting Feature of DNA Sequences Using DFT Transformation

Pan CHANG, Cheng ZHONG
DOI: https://doi.org/10.3969/j.issn.1000-1220.2018.03.011
2018-01-01
Abstract: The Discrete Fourier Transform (DFT) is used to reveal the hidden information in DNA sequences without loss of information. Three biological characteristics of the subsequences in a DNA sequence are mined: their categories, content, and positions. Feature vectors of equal length are extracted from DNA sequences of arbitrary length, and the similarity between DNA sequences is computed as the Euclidean distance between these vectors. On this basis, an improved alignment-free DNA sequence similarity algorithm called AFCS DFT is proposed and applied to clustering species. Experimental results show that, compared with existing methods, AFCS DFT computes more accurate DNA sequence similarities and uses them to construct accurate phylogenetic trees by clustering species. The constructed phylogenetic trees reflect the features of the species clusters, and these features indicate that the closer two species are in evolutionary level, the more similar their DNA sequences.
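The general idea of the abstract (DFT-based features of equal length, compared by Euclidean distance) can be illustrated with a minimal sketch. This is not the AFCS DFT algorithm, whose feature construction the abstract does not specify. The sketch assumes a binary indicator sequence per nucleotide and summarizes each power spectrum into a fixed number of frequency bands; the function names and the `bins` parameter are hypothetical.

```python
import cmath
import math

NUCLEOTIDES = "ACGT"


def power_spectrum(signal):
    """Naive O(n^2) DFT of a real signal; returns |X[k]|^2 for k = 0..n-1."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def feature_vector(seq, bins=4):
    """Map a DNA sequence of any length to a vector of length 4 * (1 + bins).

    Assumption (not from the paper): for each nucleotide we keep its relative
    frequency plus the mean power in `bins` frequency bands, skipping the
    k = 0 term, and divide the power by the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    features = []
    for base in NUCLEOTIDES:
        # Indicator sequence: 1 where the base occurs, 0 elsewhere.
        indicator = [1.0 if ch == base else 0.0 for ch in seq]
        features.append(sum(indicator) / n)
        spectrum = power_spectrum(indicator)
        for b in range(bins):
            start = 1 + b * (n - 1) // bins
            end = 1 + (b + 1) * (n - 1) // bins
            band = spectrum[start:end]
            # An empty band (very short sequence) contributes 0.
            features.append(sum(band) / (len(band) * n) if band else 0.0)
    return features


def euclidean_distance(seq_a, seq_b, bins=4):
    """Euclidean distance between feature vectors; smaller means more similar."""
    return math.dist(feature_vector(seq_a, bins), feature_vector(seq_b, bins))
```

A matrix of such pairwise distances could then be passed to a standard hierarchical clustering routine to build a tree; which clustering method the authors use is not stated in the abstract.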