On the Mkv Model with Among-Character Rate Variation

Alessio Capobianco, Sebastian Hoehna
DOI: https://doi.org/10.1101/2024.11.15.623796
2024-11-17
Abstract: Models used in likelihood-based morphological phylogenetics often adapt molecular phylogenetics models to the specificities of morphological data. Such is the case for the widely used Mkv model, which introduces an acquisition bias correction for sampling only characters that are observed to be variable, and for models of among-character rate variation (ACRV), routinely applied by researchers to relax the equal-rates assumption of Mkv. However, the interaction between variable character acquisition bias and ACRV has never been explored before. We demonstrate that there are two distinct approaches to condition the likelihood on variable characters when there is ACRV, and we call them joint and marginal acquisition bias. Far from being just a trivial mathematical detail, we show that the way in which the variable character conditional likelihood is calculated results in different assumptions about how rate variation is distributed in morphological datasets. Simulations demonstrate that tree length and amount of ACRV in the data are systematically biased when conditioning on variable characters differently from how the data was simulated. Moreover, an empirical case study with extant and extinct taxa reveals a potential impact not only on the estimation of branch lengths, but also of phylogenetic relationships. We recommend the use of the marginal acquisition bias approach for morphological datasets modeled with ACRV. Finally, we urge developers of phylogenetic software to clarify which acquisition bias correction is implemented for both estimation and simulation, and we discuss the implications of our findings on modeling variable characters for the future of morphological phylogenetics.
Evolutionary Biology
What problem does this paper attempt to address?
This paper addresses how to correctly handle the interaction between variable-character acquisition bias and among-character rate variation (ACRV) in morphological phylogenetic analysis. Specifically, it examines two ways of conditioning the likelihood on variable characters, joint acquisition bias and marginal acquisition bias, and how each affects estimates of tree length and of the amount of ACRV.

### Background

- **Mkv model**: A widely used model in morphological phylogenetics that introduces an acquisition bias correction to account for sampling only characters observed to be variable.
- **ACRV**: To relax the equal-rates assumption of the Mkv model, researchers usually apply ACRV models, which allow different characters to evolve at different rates.

### Research Objectives

- **Explore the interaction between acquisition bias and ACRV**: The paper explores this interaction for the first time and shows its impact on estimates of tree length and the amount of ACRV.
- **Propose two different conditional likelihood calculation methods**:
  - **Joint acquisition bias**: Correct the likelihood within each rate category separately.
  - **Marginal acquisition bias**: Correct the likelihood after integrating over all rate categories.

### Main Findings

- **Results of simulation studies**:
  - When both variable and invariant characters are simulated, the mMkv+Γ model is significantly better than the jMkv+Γ model at correctly estimating tree length and the amount of ACRV.
  - When the data are generated only under joint acquisition bias or only under marginal acquisition bias, the corresponding model (jMkv+Γ and mMkv+Γ, respectively) performs best in estimating tree length and the amount of ACRV.
  - When acquisition bias is not accounted for, the tree-length estimate is similar to that of the jMkv+Γ model, but the amount of ACRV is severely underestimated, especially when the dataset contains only variable characters.

### Conclusions

- **Recommend the use of marginal acquisition bias correction**: The paper recommends the marginal acquisition bias correction for morphological datasets modeled with ACRV.
- **Software developers need to clearly state the implemented acquisition bias correction method**: The authors call on developers of phylogenetic software to state clearly which acquisition bias correction their software implements, so that users can better understand and apply these models.

### Formula Summary

- **Conditional likelihood of the Mkv model**:
  \[ L_{\text{Mkv}}(D_i)=\frac{P(D_i,\text{Var}\mid\tau,\Lambda)}{1 - P(\neg\text{Var}\mid\tau,\Lambda)} \]
- **Conditional likelihood under multiple rate categories**:
  - **Joint acquisition bias correction**:
    \[ L_{\text{jMkv}+\Gamma}=\prod_{i=1}^{n}\sum_{j=1}^{r}\frac{P(D_i\mid\tau,\Lambda,\rho_j)\times P(\rho_j)}{1 - P(\neg\text{Var}\mid\tau,\Lambda,\rho_j)} \]
  - **Marginal acquisition bias correction**:
    \[ L_{\text{mMkv}+\Gamma}=\prod_{i=1}^{n}\frac{\sum_{j=1}^{r} P(D_i\mid\tau,\Lambda,\rho_j)\times P(\rho_j)}{\sum_{j=1}^{r}\left[1 - P(\neg\text{Var}\mid\tau,\Lambda,\rho_j)\right]\times P(\rho_j)} \]
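
The difference between the joint and marginal corrections can be made concrete with a small numerical sketch. The Python below is an illustration under stated assumptions, not the authors' implementation: it uses binary characters on a three-tip star tree, hypothetical branch lengths, and four hypothetical equally weighted rate multipliers that stand in for a discretized gamma distribution. It also assumes that the marginal denominator is the rate-averaged probability of observing a variable character. The function names `mk2_transition`, `pattern_prob`, and `log_likelihoods` are invented for this example.

```python
import math
from itertools import product


def mk2_transition(same_state: bool, expected_changes: float) -> float:
    """Binary (k = 2) Mk transition probability along one branch."""
    decay = math.exp(-2.0 * expected_changes)
    return 0.5 + 0.5 * decay if same_state else 0.5 - 0.5 * decay


def pattern_prob(pattern, branch_lengths, rate):
    """P(pattern | star tree, rate), summed over two equally likely root states."""
    total = 0.0
    for root in (0, 1):
        prob = 0.5
        for tip, length in zip(pattern, branch_lengths):
            prob *= mk2_transition(tip == root, rate * length)
        total += prob
    return total


def log_likelihoods(data, branch_lengths, rates, weights):
    """Return (joint, marginal) log-likelihoods for variable-only data."""
    n_tips = len(branch_lengths)
    # Per-category probability that a character is constant (all 0s or all 1s).
    p_const = [
        pattern_prob((0,) * n_tips, branch_lengths, r)
        + pattern_prob((1,) * n_tips, branch_lengths, r)
        for r in rates
    ]
    # Marginal correction (assumed form): one denominator shared by all
    # categories, the rate-averaged probability of a variable character.
    p_var_marginal = sum(w * (1.0 - c) for w, c in zip(weights, p_const))

    joint = 0.0
    marginal = 0.0
    for pattern in data:
        site = [pattern_prob(pattern, branch_lengths, r) for r in rates]
        # Joint correction: each category is divided by its own P(variable).
        joint += math.log(
            sum(w * s / (1.0 - c) for w, s, c in zip(weights, site, p_const))
        )
        marginal += math.log(
            sum(w * s for w, s in zip(weights, site)) / p_var_marginal
        )
    return joint, marginal


# Hypothetical inputs: three tips with unequal branch lengths and four equally
# weighted rate multipliers with mean 1 (not a real gamma discretization).
branch_lengths = (0.1, 0.5, 1.0)
rates = [0.1, 0.5, 1.0, 2.4]
weights = [0.25, 0.25, 0.25, 0.25]
variable_patterns = [p for p in product((0, 1), repeat=3) if len(set(p)) > 1]

joint, marginal = log_likelihoods(variable_patterns, branch_lengths, rates, weights)
print(f"joint: {joint:.4f}  marginal: {marginal:.4f}")
```

In this sketch both corrected likelihoods sum to one over the variable patterns, and they coincide when there is a single rate category. They diverge once rates vary, because the joint form rescales each category by its own probability of producing a variable character, whereas the marginal form applies one shared rescaling.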