CoLaML: Inferring latent evolutionary modes from heterogeneous gene content

Shun Yamanouchi, Tsukasa Fukunaga, Wataru Iwasaki
DOI: https://doi.org/10.1101/2024.12.02.626417
2024-12-05
Abstract:
Motivation: Estimating the history of gene content evolution provides insights into genome evolution on a macroevolutionary timescale. Previous models did not consider heterogeneity in evolutionary patterns among gene families across different periods and/or clades.
Results: We introduce CoLaML (joint inference of gene COntent evolution and its LAtent modes using Maximum Likelihood), which considers heterogeneity using a Markov-modulated Markov chain. This model assumes that internal states determine evolutionary patterns (i.e., latent evolutionary modes) and attributes heterogeneity to their switchover during the evolutionary timeline. We developed a practical algorithm for model inference and validated its performance through simulations. CoLaML outperformed previous models in fitting empirical datasets and estimated plausible evolutionary histories, capturing heterogeneity among clades and gene families without prior knowledge.
Availability: CoLaML is freely available at https://github.com/mtnouchi/colaml.
Bioinformatics
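
The Markov-modulated Markov chain named in the abstract can be illustrated with a small sketch. This is not CoLaML's code or its parameterization; it only shows the general idea of a joint state (latent mode, gene copy number) in which the mode sets the gain and loss rates and occasionally switches. The function names, the copy-number cap, the constant-gain and per-copy-loss rates, and all numeric values are assumptions made for illustration.

```python
from itertools import product


def build_generator(gain, loss, switch, max_copies):
    """Joint rate matrix over (mode, copy number) states.

    gain[m], loss[m]: hypothetical gain rate and per-copy loss rate in mode m.
    switch[m][k]: hypothetical rate of switching from mode m to mode k.
    """
    n_modes = len(gain)
    states = list(product(range(n_modes), range(max_copies + 1)))
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for (m, n), i in index.items():
        if n < max_copies:            # gene gain within mode m
            Q[i][index[(m, n + 1)]] = gain[m]
        if n > 0:                     # gene loss within mode m
            Q[i][index[(m, n - 1)]] = loss[m] * n
        for k in range(n_modes):      # mode switch, copy number unchanged
            if k != m:
                Q[i][index[(k, n)]] = switch[m][k]
        Q[i][i] = -sum(Q[i])          # each row of a rate matrix sums to zero
    return states, Q


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transition_probs(Q, t, squarings=20):
    """Approximate exp(Q t) as (I + Q t / 2^k)^(2^k) by repeated squaring."""
    size = len(Q)
    step = t / 2 ** squarings
    P = [[(1.0 if i == j else 0.0) + Q[i][j] * step for j in range(size)]
         for i in range(size)]
    for _ in range(squarings):
        P = matmul(P, P)
    return P


# Assumed toy values: mode 0 has slow gene turnover, mode 1 has fast turnover.
gain = [0.1, 1.0]
loss = [0.1, 1.0]
switch = [[0.0, 0.05], [0.05, 0.0]]
states, Q = build_generator(gain, loss, switch, max_copies=2)
P = transition_probs(Q, t=1.0)   # P[i][j]: probability of state j after time t from state i
```

Only the copy number would be observed at the tips of a phylogeny, so a likelihood computation under such a model sums over the unobserved mode. For anything beyond a toy example, a library matrix exponential such as `scipy.linalg.expm` is a better choice than the simple squaring approximation used here.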