Terrace Aware Phylogenomic Inference from Supermatrices

Olga Chernomor, Arndt von Haeseler, Bui Quang Minh
DOI: https://doi.org/10.48550/arXiv.1411.3480
2014-11-13
Abstract: One approach in phylogenomics to infer the tree of life is based on concatenated multiple sequence alignments from many genes. Unfortunately, the resulting so-called supermatrix is usually sparse, that is, not every gene sequence is available for all species in the supermatrix. Because of the missing sequence information, a phylogenetic inference that assumes each gene evolves with its own substitution model suffers from phylogenetic terraces, on which many phylogenetic trees have the same likelihood. Here, we propose a phylogenetic terrace aware (PTA) data structure for efficient supermatrix-based tree inference under partition models. PTA avoids likelihood computations for trees belonging to the same terrace. PTA is implemented in the IQ-TREE software and leads to a 1.7- to 6-fold speedup for real data sets compared with a naïve implementation. Speedups are independent of terrace sizes but correlate with the amount of missing data. Thus, the PTA data structure is well suited for phylogenomic analyses. IQ-TREE source code, binaries, and documentation are freely available at http://www.cibiv.at/software/iqtree.
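To make the notion of a terrace concrete, the following is a minimal Python sketch and not the PTA data structure or any part of IQ-TREE. It assumes the standard definition that two trees lie on the same terrace when, for every partition (gene), they induce the same subtree on the taxa that have data for that partition. Trees are written as nested tuples of taxon names, and all names (`leaf_sets`, `induced_splits`, `same_terrace`, the example trees and partitions) are hypothetical and chosen for illustration only.

```python
def leaf_sets(tree):
    """Return (leaves below this node, leaf sets of every node in the subtree)."""
    if isinstance(tree, str):  # a leaf is a taxon name
        s = frozenset([tree])
        return s, [s]
    all_leaves = frozenset()
    clusters = []
    for child in tree:
        leaves, sub = leaf_sets(child)
        all_leaves |= leaves
        clusters.extend(sub)
    clusters.append(all_leaves)
    return all_leaves, clusters


def induced_splits(tree, taxa):
    """Non-trivial splits of the unrooted tree restricted to `taxa`."""
    taxa = frozenset(taxa)
    _, clusters = leaf_sets(tree)
    splits = set()
    for cluster in clusters:
        side = cluster & taxa
        other = taxa - side
        # Keep only splits with at least two taxa on each side.
        if len(side) >= 2 and len(other) >= 2:
            splits.add(frozenset([side, other]))
    return splits


def same_terrace(tree1, tree2, partitions):
    """True if both trees induce identical subtrees on every partition."""
    return all(
        induced_splits(tree1, taxa) == induced_splits(tree2, taxa)
        for taxa in partitions
    )


# Hypothetical example: two genes, each missing one of the five taxa.
PARTITIONS = [["A", "B", "C", "D"], ["A", "C", "D", "E"]]
T1 = (("A", "B"), "E", ("C", "D"))
T2 = (("A", "E"), "B", ("C", "D"))
T3 = (("A", "C"), "B", ("D", "E"))

print(same_terrace(T1, T2, PARTITIONS))
print(same_terrace(T1, T3, PARTITIONS))
```

In this toy example `T1` and `T2` are different trees on five taxa, yet they agree on both partitions because taxa B and E never occur in the same gene. Under a partition model in which each gene has its own parameters and branch lengths, the likelihood of each gene depends only on its induced subtree, so a search that recognizes this situation could reuse the likelihood already computed for `T1` instead of recomputing it for `T2`.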
Populations and Evolution