Abstract: Divergence time estimation (the calibration of a phylogeny to geological time) is an integral first step in modeling the tempo of biological evolution (traits and lineages). However, despite increasingly sophisticated methods to infer divergence times from molecular genetic sequences, the estimated ages of many nodes across the tree of life contrast significantly and consistently with timeframes conveyed by the fossil record. This is perhaps best exemplified by crown angiosperms, where molecular clock (Triassic) estimates predate the oldest (Early Cretaceous) undisputed angiosperm fossils by tens of millions of years or more. While the incompleteness of the fossil record is a common concern, issues of data limitation and model inadequacy are viable (if underexplored) alternative explanations. In this vein, Beaulieu et al. (2015) convincingly demonstrated how methods of divergence time inference can be misled by both (i) extreme state-dependent molecular substitution rate heterogeneity and (ii) biased sampling of representative major lineages. These results demonstrate the impact of (potentially common) model violations. Here, we suggest another potential challenge: that the configuration of the statistical inference problem (i.e., the parameters, their relationships, and associated priors) alone may preclude the reconstruction of the paleontological timeframe for the crown age of angiosperms. We demonstrate, by sampling from the joint prior (formed by combining the tree (diversification) prior with the calibration densities specified for fossil-calibrated nodes), that with no data present at all, an Early Cretaceous crown age for angiosperms is rejected (i.e., has essentially zero probability). More worrisome, however, is that for the 24 nodes calibrated by fossils, almost all have indistinguishable marginal prior and posterior age distributions when employing routine lognormal fossil calibration priors.
These results indicate that there is inadequate information in the data to overrule the joint prior. Because these calibrated nodes are strategically placed in disparate regions of the tree, they act to anchor the tree scaffold, and so the posterior inference for the tree as a whole is largely determined by the pseudodata present in the (often arbitrary) calibration densities. We recommend, as for any Bayesian analysis, that marginal prior and posterior distributions be carefully compared to determine whether signal is coming from the data or from prior belief, especially for parameters of direct interest. This recommendation is not novel. However, given how rarely such checks are carried out in evolutionary biology, it bears repeating. Our results demonstrate the fundamental importance of prior/posterior comparisons in any Bayesian analysis, and we hope that they further encourage both researchers and journals to consistently adopt this crucial step as standard practice. Finally, we note that the results presented here do not refute the biological modeling concerns identified by Beaulieu et al. (2015). Both sets of issues remain apposite to the goals of accurate divergence time estimation, and only by considering them in tandem can we move forward more confidently.
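The recommended prior/posterior comparison can be illustrated with a minimal sketch. It is not the authors' procedure: the overlap coefficient is one simple summary chosen here for illustration, and the function names, the offset-lognormal parameters, and the simulated samples are all hypothetical. In a real analysis the two samples for a node age would come from MCMC runs without and with the sequence data.

```python
import random


def histogram_overlap(prior, posterior, bins=50):
    """Overlap coefficient in [0, 1] between two samples of a node age.

    Values near 1 mean the marginal posterior is nearly indistinguishable
    from the marginal prior, i.e. the data added little information.
    """
    lo = min(min(prior), min(posterior))
    hi = max(max(prior), max(posterior))
    if hi == lo:
        return 1.0
    width = (hi - lo) / bins

    def proportions(samples):
        counts = [0] * bins
        for x in samples:
            # Clamp so the maximum value falls in the last bin.
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    p = proportions(prior)
    q = proportions(posterior)
    return sum(min(a, b) for a, b in zip(p, q))


def offset_lognormal(rng, offset, mu, sigma, n):
    """Draw n ages from a lognormal calibration shifted by a minimum age."""
    return [offset + rng.lognormvariate(mu, sigma) for _ in range(n)]


# Hypothetical stand-ins for MCMC samples of one calibrated node (ages in Myr).
rng = random.Random(1)
prior = offset_lognormal(rng, 100.0, 1.5, 0.5, 5000)
# Case 1: the "posterior" follows the same distribution as the prior.
same = offset_lognormal(rng, 100.0, 1.5, 0.5, 5000)
# Case 2: the "posterior" is shifted 20 Myr older by informative data.
shifted = [x + 20.0 for x in offset_lognormal(rng, 100.0, 1.5, 0.5, 5000)]

overlap_same = histogram_overlap(prior, same)
overlap_shifted = histogram_overlap(prior, shifted)
print(f"prior vs. uninformed posterior: {overlap_same:.2f}")
print(f"prior vs. shifted posterior:    {overlap_shifted:.2f}")
```

A high overlap for a calibrated node is the pattern the abstract describes as worrisome: the posterior age is essentially the calibration density restated. The bin count is an arbitrary choice, and the summary should be read alongside plots of the two distributions, not in place of them.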

The Past Sure is Tense: On Interpreting Phylogenetic Divergence Time Estimates
