Reconstructing protein and gene phylogenies by extending the framework of reconciliation

Esaie Kuitche, Manuel Lafond, Aïda Ouangraoua
DOI: https://doi.org/10.48550/arXiv.1610.09732
2017-07-04
Abstract: The architecture of eukaryotic coding genes allows a single gene to produce several different protein isoforms. Current gene phylogeny reconstruction methods use a single protein product per gene, ignoring information on alternative protein isoforms. These methods often yield inaccurate gene trees that must be corrected before they can be used in phylogenetic tree reconciliation analyses or in the reconstruction of gene product phylogenies. Here, we propose a new approach for reconstructing accurate gene trees and protein trees that accounts for the production of alternative protein isoforms by the genes of a gene family. We extend the concept of reconciliation to protein trees, and we define a new reconciliation problem called MinDRGT, which consists of finding a gene tree that minimizes a double reconciliation cost with a given protein tree and a given species tree. We define a second problem called MinDRPGT, which consists of finding a protein tree and a gene tree that minimize a double reconciliation cost, given a species tree and a set of protein subtrees. We provide exact and heuristic algorithmic solutions for some versions of these problems, and we present the results of an application to the correction of gene trees from the Ensembl database. An implementation of the heuristic method is available at https://github.com/UdeS-CoBIUS/Protein2GeneTree.
Data Structures and Algorithms, Populations and Evolution, Quantitative Methods
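
To make the idea of a double reconciliation cost from the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation and not the code in the linked repository. It assumes trees are nested tuples with string leaves, uses the classical LCA mapping, and counts only nodes that map to the same place as one of their children (creation-like events for the protein tree against the gene tree, duplication-like events for the gene tree against the species tree). Losses are deliberately ignored, and all function names, tree shapes, and leaf mappings are hypothetical illustrations.

```python
def leaf_set(tree):
    """Return the set of leaf names below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def lca_cluster(tree, targets):
    """Leaf set of the lowest node of `tree` whose leaves include all `targets`."""
    node = tree
    while not isinstance(node, str):
        below = [child for child in node if targets <= leaf_set(child)]
        if not below:
            break  # no single child contains every target: this node is the LCA
        node = below[0]
    return leaf_set(node)


def count_non_speciation(upper, lower, leaf_map):
    """Count internal nodes of `upper` whose LCA image in `lower`
    equals the image of one of their children (simplified: no losses)."""
    count = 0

    def visit(node):
        nonlocal count
        if isinstance(node, str):
            return frozenset([leaf_map[node]])
        child_images = [visit(child) for child in node]
        image = lca_cluster(lower, frozenset().union(*child_images))
        if image in child_images:
            count += 1
        return image

    visit(upper)
    return count


def double_reconciliation_cost(protein_tree, gene_tree, species_tree,
                               protein_to_gene, gene_to_species):
    """Sum of the two simplified reconciliation costs (assumed unit weights)."""
    creations = count_non_speciation(protein_tree, gene_tree, protein_to_gene)
    duplications = count_non_speciation(gene_tree, species_tree, gene_to_species)
    return creations + duplications


# Hypothetical toy data: four proteins, four genes, three species.
species_tree = (("a", "b"), "c")
gene_tree = ((("g1", "g2"), "g3"), "g4")
protein_tree = (("p1", "p2"), ("p3", "p4"))
gene_to_species = {"g1": "a", "g2": "a", "g3": "b", "g4": "c"}
protein_to_gene = {"p1": "g1", "p2": "g1", "p3": "g2", "p4": "g3"}

cost = double_reconciliation_cost(protein_tree, gene_tree, species_tree,
                                  protein_to_gene, gene_to_species)
print(cost)
```

In this simplified setting, MinDRGT would amount to searching over candidate gene trees for the one minimizing `double_reconciliation_cost` with the protein tree and species tree held fixed; the search itself is not sketched here.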