A distance-based model for convergent evolution

Barbara Holland, Katharina T. Huber, Vincent Moulton
DOI: https://doi.org/10.1007/s00285-023-02038-9
2024-01-20
Journal of Mathematical Biology
Abstract: Convergent evolution is an important process in which independent species evolve similar features usually over a long period of time. It occurs with many different species across the tree of life, and is often caused by the fact that species have to adapt to similar environmental niches. In this paper, we introduce and study properties of a distance-based model for convergent evolution in which we assume that two ancestral species converge for a certain period of time within a collection of species that have otherwise evolved according to an evolutionary clock. Under these assumptions it follows that we obtain a distance on the collection that is a modification of an ultrametric distance arising from an equidistant phylogenetic tree. As well as characterising when this modified distance is a tree metric, we give conditions in terms of the model's parameters for when it is still possible to recover the underlying tree and also its height, even in case the modified distance is not a tree metric.
mathematical & computational biology, biology
What problem does this paper attempt to address?
The paper addresses how to model convergent evolution between species using distance data. Specifically, the authors propose a distance-based model in which two ancestral species converge for a certain period of time, and they study when the underlying phylogenetic tree and its height can be recovered from the resulting modified distances.

### Background and Problems of the Paper

**Background**:

- **Convergent evolution**: Different species independently evolve similar characteristics under similar environmental conditions. It is an important process in evolutionary biology.
- **Phylogenetic tree**: A weighted rooted tree whose leaves correspond to the species under study. It is the usual way to represent the evolutionary history of a group of species or taxonomic units.
- **Standard assumption**: Most phylogenetic models assume that, once two species have diverged, their descendants evolve conditionally independently at a relatively constant rate. Under this assumption, the evolutionary distance between two species is proportional to the time since their most recent common ancestor.
- **Non-standard processes**: Some real evolutionary processes violate this assumption, such as recombination in viruses, horizontal gene transfer in bacteria, and hybridization and introgression in plants and animals.

**Problems**:

- **Modeling of convergent evolution**: Existing models of convergent evolution mainly focus on changes in molecular sequences or morphological characteristics. Few model it from the perspective of distance data.
- **Utilization of distance data**: How can the underlying phylogenetic tree and its height be recovered from the modified distances, even when those distances are no longer a tree metric?

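
The standard assumption in the background above can be written as a short formula. This is a sketch in assumed notation that does not come from the summary: \(h(v)\) is the height of a vertex \(v\) above the leaves of an equidistant tree, \(\operatorname{lca}(x, y)\) is the most recent common ancestor of \(x\) and \(y\), and distance accumulates at rate 1 along each edge. The clock-like distance \(d\) then satisfies

\[
d(x, y) = 2\,h(\operatorname{lca}(x, y)),
\qquad
d(x, z) \le \max\{\, d(x, y),\; d(y, z) \,\} \quad \text{for all species } x, y, z .
\]

The second relation is the ultrametric (three-point) condition. For example, if \(x\) and \(y\) split 1 time unit ago and both split from \(z\) 3 time units ago, then \(d(x, y) = 2\) and \(d(x, z) = d(y, z) = 6\), so the two largest of the three distances coincide.
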
### Methods and Contributions of the Paper

**Methods**:

- **Model assumption**: Two ancestral species undergo convergent evolution for a certain period of time, while all other species evolve according to a standard evolutionary clock. The result is a modification of an ultrametric distance.
- **Mathematical description**: Define a new distance \(d'\) as a modified version of the original distance \(d\), where \(d\) is the distance induced by the phylogenetic tree \(T\). For species \(x\) and \(y\) affected by the convergence, \(d'(x, y)\) is obtained from \(d(x, y)\) by subtracting a value proportional to the duration of the convergence.

**Contributions**:

- **Model characteristics**: Study the properties of the modified distance \(d'\), in particular characterising when it is still a tree metric.
- **Recovery conditions**: Give conditions, in terms of the model's parameters, under which the underlying phylogenetic tree and its height can be recovered from \(d'\).
- **Theoretical results**: Show that in some cases the underlying tree can still be recovered even if \(d'\) is no longer a tree metric.

### Key Conclusions

- **Distance model of convergent evolution**: Convergent evolution can be modeled by adjusting the distances induced by a phylogenetic tree.
- **Recovery conditions**: The paper gives specific mathematical conditions under which the topology and height of the underlying tree can still be recovered from the modified distances, despite the convergence.
- **Theoretical significance**: The results offer a new perspective on how convergent evolution affects phylogenetic reconstruction and a theoretical basis for future research.

In conclusion, the paper fills a gap in the field by proposing a distance-based model of convergent evolution, and it provides new tools for phylogenetic analysis.
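
To make the modified distance \(d'\) described under Methods more concrete, here is a minimal Python sketch. It rests on assumptions that are not taken from the paper: the four-leaf tree and its distances are a made-up example, the convergence is simplified to subtracting one constant `delta` from every distance between two hypothetical leaf sets `A` and `B`, and the helper names `converge` and `is_tree_metric` are illustrative. The tree-metric test is the standard four-point condition, checked on distinct quadruples only, which is enough when the input is already a metric.

```python
from itertools import combinations

def dist(d, x, y):
    """Look up a symmetric distance stored on unordered pairs."""
    return 0.0 if x == y else d[frozenset((x, y))]

def converge(d, A, B, delta):
    """Return a copy of d with every A-to-B distance reduced by delta.

    Simplified stand-in for the modified distance d'; the exact form
    used in the paper may differ.
    """
    out = dict(d)
    for x in A:
        for y in B:
            out[frozenset((x, y))] -= delta
    return out

def is_tree_metric(d, leaves, tol=1e-9):
    """Four-point condition: in every quadruple the two largest sums agree."""
    for w, x, y, z in combinations(leaves, 4):
        sums = sorted([
            dist(d, w, x) + dist(d, y, z),
            dist(d, w, y) + dist(d, x, z),
            dist(d, w, z) + dist(d, x, y),
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Hypothetical equidistant tree ((a,b),(c,d)) of height 3:
# a and b split 1 time unit ago, c and d split 2 time units ago.
leaves = ["a", "b", "c", "d"]
pairs = {
    ("a", "b"): 2.0, ("c", "d"): 4.0,
    ("a", "c"): 6.0, ("a", "d"): 6.0,
    ("b", "c"): 6.0, ("b", "d"): 6.0,
}
d = {frozenset(p): v for p, v in pairs.items()}

# Convergence between the two whole clades versus between single lineages.
d_whole = converge(d, {"a", "b"}, {"c", "d"}, 1.0)
d_partial = converge(d, {"b"}, {"c"}, 1.0)

print(is_tree_metric(d, leaves))
print(is_tree_metric(d_whole, leaves))
print(is_tree_metric(d_partial, leaves))
```

In this toy example, the original distance and the clade-level modification both pass the four-point check, while reducing only the distance between `b` and `c` fails it. This illustrates that a modified distance may or may not remain a tree metric. It is not a restatement of the paper's characterisation.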