Locating Large-Scale Gene Duplication Events through Reconciled Trees: Implications for Identifying Ancient Polyploidy Events in Plants

J.G. Burleigh, M.S. Bansal, A. Wehe, O. Eulenstein
DOI: https://doi.org/10.1089/cmb.2009.0139
IF: 1.549
2009-08-01
Journal of Computational Biology
Abstract: Recent analyses of plant genomic data have found extensive evidence of ancient whole genome duplication (or polyploidy) events, but there are many unresolved questions regarding the number and timing of such events in plant evolutionary history. We describe the first exact and efficient algorithm for the Episode Clustering problem, which, given a collection of rooted gene trees and a rooted species tree, seeks the minimum number of locations on the species tree of gene duplication events. Solving this problem allows one to place gene duplication events onto nodes of a given species tree and potentially detect large-scale gene duplication events. We examined the performance of an implementation of our algorithm using 85 plant gene trees that contain genes from a total of 136 plant taxa. We found evidence of large-scale gene duplication events in Populus, Gossypium, Poaceae, Asteraceae, Brassicaceae, Solanaceae, Fabaceae, and near the root of the eudicot clade that are consistent with previous genomic evidence. However, a lack of phylogenetic signal within the gene trees can produce erroneous evidence of large-scale duplication events, especially near the root of the species tree. Although the results of our algorithm should be interpreted cautiously, they provide hypotheses for precise locations of large-scale gene duplication events with data from relatively few gene trees and can complement other genomic approaches to provide a more comprehensive view of ancient large-scale gene duplication events.
Categories: biochemical research methods; biotechnology & applied microbiology; mathematical & computational biology; computer science, interdisciplinary applications; statistics & probability
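
To make the setting in the abstract concrete, the following minimal sketch shows how duplication events in a rooted gene tree can be placed on nodes of a rooted species tree using the standard least-common-ancestor (LCA) mapping. It is not the paper's Episode Clustering algorithm: it does not attempt to minimize the number of distinct duplication locations, and only tallies duplications at their LCA-mapped nodes. The nested-tuple tree encoding, the function names `clusters` and `lca_duplications`, and the toy trees are assumptions made for illustration.

```python
from collections import Counter

# Assumed encoding: a tree is a nested tuple; a leaf is a species name (str).
# Gene-tree leaves are labeled with the species each gene was sampled from.

def clusters(tree):
    """Return the leaf set (as a frozenset) of every node in the tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found

def lca_duplications(gene_tree, species_tree):
    """Count duplications per species-tree node under the LCA mapping.

    A gene-tree node is called a duplication when it maps to the same
    species-tree node as at least one of its children.
    """
    species_clusters = clusters(species_tree)

    def lca(species):
        # Smallest species-tree cluster that contains all the given species.
        return min((c for c in species_clusters if species <= c), key=len)

    counts = Counter()

    def walk(node):
        if isinstance(node, str):
            return lca(frozenset([node]))
        child_maps = [walk(child) for child in node]
        here = lca(frozenset().union(*child_maps))
        if here in child_maps:
            counts[here] += 1
        return here

    walk(gene_tree)
    return counts

# Toy data (hypothetical): one species tree and two gene trees.
species = (("A", "B"), "C")
gene_trees = [
    ((("A", "B"), ("A", "B")), "C"),  # duplication before the A/B split
    (("A", "C"), ("B", "C")),         # duplication at the root
]

total = Counter()
for gene_tree in gene_trees:
    total.update(lca_duplications(gene_tree, species))

for node, n in total.items():
    print(",".join(sorted(node)), n)
```

Species-tree nodes that accumulate many duplications across gene trees are candidates for large-scale duplication events, subject to the caution in the abstract that weak phylogenetic signal can produce spurious support, especially near the root.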