Identification and Measurement of Neighbor Dependent Nucleotide Substitution Processes

Peter F. Arndt, Terence Hwa
DOI: https://doi.org/10.48550/arXiv.q-bio/0501018
2005-01-13
Abstract: The presence of neighbor dependencies generates a specific pattern of dinucleotide frequencies in all organisms. In particular, the CpG-methylation-deamination process is the predominant substitution process in vertebrates and needs to be incorporated into a more realistic model of nucleotide substitution. Based on a general framework of nucleotide substitutions, we develop a method that identifies the most relevant neighbor-dependent substitution processes, measures their strength, and judges whether they are important enough to be included in the model. Starting from a model of neighbor-independent nucleotide substitution, we successively add neighbor-dependent substitution processes in the order of their ability to increase the likelihood of the model given the data. We present an analysis of neighbor-dependent nucleotide substitutions in human, zebrafish, and fruit fly. A web server that performs this analysis is publicly available.
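The successive-addition procedure described in the abstract can be illustrated with a generic greedy forward-selection loop. This is a minimal sketch, not the authors' implementation: the function name `forward_select`, the `min_gain` stopping threshold, the process labels, and the toy likelihood values are all assumptions made for illustration. A real analysis would re-estimate all substitution rates by maximum likelihood each time a process is added.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def forward_select(
    candidates: Iterable[str],
    log_likelihood: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.0,
) -> List[Tuple[str, float]]:
    """Greedily add the candidate process that raises the log-likelihood most.

    Returns the processes in the order they were added, each paired with the
    log-likelihood gain it contributed at that step.
    """
    remaining = set(candidates)
    selected: FrozenSet[str] = frozenset()
    current = log_likelihood(selected)  # neighbor-independent baseline model
    order: List[Tuple[str, float]] = []
    while remaining:
        # Score each remaining process when added to the current model.
        scored = [(log_likelihood(selected | {p}), p) for p in sorted(remaining)]
        best_ll, best = max(scored)
        gain = best_ll - current
        if gain <= min_gain:
            break  # no remaining process improves the model enough
        selected = selected | {best}
        remaining.remove(best)
        current = best_ll
        order.append((best, gain))
    return order


# Hypothetical stand-in for a real likelihood: each process contributes a
# fixed gain. The labels and numbers are made up, purely for illustration.
TOY_GAINS = {"CpG->CpA/TpG": 120.0, "ApT->ApC": 8.0, "GpG->GpA": 0.5}


def toy_log_likelihood(processes: FrozenSet[str]) -> float:
    return -1000.0 + sum(TOY_GAINS[p] for p in processes)


result = forward_select(TOY_GAINS, toy_log_likelihood, min_gain=1.0)
print(result)
```

With these toy values the loop adds the two processes with the largest gains and stops before the third, whose gain falls below the assumed threshold. The order of addition doubles as a ranking of the processes by importance.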
Genomics