An Integrated Reconciliation Framework for Domain, Gene, and Species Level Evolution

Lei Li, Mukul S. Bansal
DOI: https://doi.org/10.1109/tcbb.2018.2846253
2019-01-01
Abstract: The majority of genes in eukaryotes consist of one or more protein domains that can be independently lost or gained during evolution. This gain and loss of protein domains, through domain duplications, transfers, or losses, has important evolutionary and functional consequences. Yet, even though it is well understood that domains evolve inside genes and genes inside species, there do not exist any computational frameworks to simultaneously model the evolution of domains, genes, and species and account for their inter-dependency. Here, we develop an integrated model of domain evolution that explicitly captures the interdependence of domain-, gene-, and species-level evolution. Our model extends the classical phylogenetic reconciliation framework, which infers gene family evolution by comparing gene trees and species trees, by explicitly considering domain-level evolution and decoupling domain-level events from gene-level events. In this paper, we (i) introduce the new integrated reconciliation framework, (ii) prove that the associated optimization problem is NP-hard, (iii) devise an efficient heuristic solution for the problem, (iv) apply our algorithm to a large biological dataset, and (v) demonstrate the impact of using our new computational framework compared to existing approaches. The implemented software is freely available from http://compbio.engr.uconn.edu/software/seadog/.