Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns
Robert K. Jansen, Zhengqiu Cai, Linda A. Raubeson, Henry Daniell, Claude W. dePamphilis, James Leebens-Mack, Kai F. Müller, Mary Guisinger-Bellian, Rosemarie C. Haberle, Anne K. Hansen, Timothy W. Chumley, Seung-Bum Lee, Rhiannon Peery, Joel R. McNeal, Jennifer V. Kuehl, Jeffrey L. Boore
DOI: https://doi.org/10.1073/pnas.0709121104
IF: 11.1
2007-12-04
Proceedings of the National Academy of Sciences
Abstract: Angiosperms are the largest and most successful clade of land plants with >250,000 species distributed in nearly every terrestrial habitat. Many phylogenetic studies have been based on DNA sequences of one to several genes, but, despite decades of intensive efforts, relationships among early diverging lineages and several of the major clades remain either incompletely resolved or weakly supported. We performed phylogenetic analyses of 81 plastid genes in 64 sequenced genomes, including 13 new genomes, to estimate relationships among the major angiosperm clades, and the resulting trees are used to examine the evolution of gene and intron content. Phylogenetic trees from multiple methods, including model-based approaches, provide strong support for the position of Amborella as the earliest diverging lineage of flowering plants, followed by Nymphaeales and Austrobaileyales. The plastid genome trees also provide strong support for a sister relationship between eudicots and monocots, and this group is sister to a clade that includes Chloranthales and magnoliids. Resolution of relationships among the major clades of angiosperms provides the necessary framework for addressing numerous evolutionary questions regarding the rapid diversification of angiosperms. Gene and intron content are highly conserved among the early diverging angiosperms and basal eudicots, but 62 independent gene and intron losses are limited to the more derived monocot and eudicot clades. Moreover, a lineage-specific correlation was detected between rates of nucleotide substitutions, indels, and genomic rearrangements.