In Silico Genome-Genome Hybridization Values Accurately and Precisely Predict Empirical DNA-DNA Hybridization Values for Classifying Prokaryotes

Paul A. Muller, Slava S. Epstein, Paul A. Muller Jr.
DOI: https://doi.org/10.48550/arXiv.1202.5211
IF: 4.31
2012-02-23
Genomics
Abstract: For nearly 50 years, microbiologists have been determining prokaryotic genome relatedness by means of nucleic acid reassociation kinetics. These methods, however, are technically challenging, difficult to reproduce, and, given the time and resources it takes to generate a single data point, not cost-effective. In the post-genomic era, with the cost of sequencing whole prokaryotic genomes no longer a limiting factor, we believed that it would be of value to computationally predict the output of a traditional DNA-DNA hybridization experiment from pairwise comparisons of whole-genome sequences. While other computational whole-genome classification methods exist, they predict values on scales that differ widely from that of DNA-DNA hybridization, introducing yet another metric into the polyphasic approach to defining microbial species. Our goal was to develop an in silico, BLAST-based pipeline that would predict wet-lab DNA-DNA hybridization values with a high level of certainty. Here we report on one such method, which produces estimates that are both accurate and precise with respect to the DNA-DNA hybridization values they are designed to emulate.