Prokaryote Phylogeny Without Sequence Alignment: from Avoidance Signature to Composition Distance.

BL Hao,J Qi
DOI: https://doi.org/10.1109/csb.2003.1227338
2004-01-01
Journal of Bioinformatics and Computational Biology
Abstract:This is a review of a new and essentially simple method of inferring phylogenetic relationships from complete genome data without using sequence alignment. The method is based on counting the appearance frequency of oligopeptides of a fixed length (up to K = 6) in the collection of protein sequences of a species. It is a method without fine adjustment and choice of genes. Applied to prokaryotic genomes it has led to results comparable with the bacteriologists' systematics as reflected in the latest 2002 outline of the Bergey's Manual of Systematic Bacteriology. The method has also been used to compare chloroplast genomes and to the phylogeny of Coronaviruses including human SARS-CoV. A key point in our approach is subtraction of a random background from the original counts by using a Markov model of order K-2 in order to highlight the shaping role of natural selection. The implications of the subtraction procedure is specially analyzed and further development of the new approach is indicated.
What problem does this paper attempt to address?