Models of Genome Evolution
Y Zhou, B Mishra
DOI: https://doi.org/10.1007/978-3-642-18734-6_13
2004-01-01
Abstract: The evolutionary theory of "evolution by duplication", originally proposed by Susumu Ohno in 1970, can now be verified with the available genome sequences. Recently, several mathematical models that implement the idea of "evolution by duplication" have been proposed to explain the topology of protein interaction networks. The power-law distribution with its "hubby" topology (e.g., P53 was shown to interact with an unusually large number of other proteins) can be explained under the following assumption: new proteins, which are duplicates of older proteins, have a propensity to interact only with the same proteins as their evolutionary predecessors. Since protein interaction networks, as well as other higher-level cellular processes, are encoded in genomic sequences, the evolutionary structure, topology, and statistics of many biological objects (pathways, phylogeny, symbiotic relations, etc.) are rooted in the evolutionary dynamics of the genome sequences. Susumu Ohno's hypothesis can be tested "in silico" using Polya's urn model. In our model, each basic DNA sequence change is modelled using several probability distribution functions. These functions determine the insertion/deletion positions of the DNA fragments, the copy numbers of the inserted fragments, and the sequences of the inserted/deleted pieces. Moreover, these functions can be interdependent. A mathematically tractable model can be created with a directed graph representation. Such graphs are Eulerian, and each possible Eulerian path encodes a genome. Every "genome duplication" event evolves these Eulerian graphs, and the probability distributions and their dynamics give rise to many intriguing and elegant mathematical problems. In this chapter, we explore and survey these connections between biology, mathematics, and computer science in order to reveal simple, yet deep, models of life itself.
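The kind of "in silico" experiment described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's model: the function names (`polya_urn`, `duplication_event`, `evolve`), the parameters `max_len` and `max_copies`, and the uniform distributions used for fragment choice, copy number, and insertion position are all assumptions made here for illustration. The abstract also allows deletions and interdependent distributions, which this sketch omits.

```python
import random


def polya_urn(counts, steps, rng):
    """Classic Polya urn: draw a colour with probability proportional to its
    current count, then return the ball with one extra ball of that colour."""
    counts = dict(counts)  # work on a copy so the caller's dict is untouched
    for _ in range(steps):
        colours = list(counts)
        weights = [counts[c] for c in colours]
        drawn = rng.choices(colours, weights=weights, k=1)[0]
        counts[drawn] += 1
    return counts


def duplication_event(genome, rng, max_len=3, max_copies=2):
    """One hypothetical duplication event on a non-empty genome string.

    Assumption: fragment, copy number, and insertion position are each drawn
    from a uniform distribution, independently of one another.
    """
    length = rng.randint(1, min(max_len, len(genome)))
    start = rng.randrange(len(genome) - length + 1)
    fragment = genome[start:start + length]   # sequence of the inserted piece
    copies = rng.randint(1, max_copies)       # copy number of the fragment
    pos = rng.randrange(len(genome) + 1)      # insertion position
    return genome[:pos] + fragment * copies + genome[pos:]


def evolve(genome, events, seed=0):
    """Apply a fixed number of duplication events with a seeded generator."""
    rng = random.Random(seed)
    for _ in range(events):
        genome = duplication_event(genome, rng)
    return genome


if __name__ == "__main__":
    print(polya_urn({"A": 1, "C": 1, "G": 1, "T": 1}, 100, random.Random(1)))
    print(evolve("ACGT", 10))
```

In the urn, colours that are drawn early tend to be drawn again, which is the same reinforcement that duplication applies to fragments already present in a genome. Replacing the uniform draws with other distributions, or making them depend on one another, would be the natural place to experiment.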