Non-bifurcating Phylogenetic Tree Inference Via the Adaptive LASSO

Cheng Zhang, Vu Dinh, Frederick A. Matsen
DOI: https://doi.org/10.1080/01621459.2020.1778481
IF: 4.369
2020-01-01
Journal of the American Statistical Association
Abstract: Phylogenetic tree inference using deep DNA sequencing is reshaping our understanding of rapidly evolving systems, such as the within-host battle between viruses and the immune system. Densely sampled phylogenetic trees can contain special features, including sampled ancestors, in which we sequence a genotype along with its direct descendants, and polytomies, in which multiple descendants arise simultaneously. These features become apparent once zero-length branches in the tree are identified. However, current maximum-likelihood-based approaches are not capable of revealing such zero-length branches. In this paper, we find these zero-length branches by introducing adaptive-LASSO-type regularization estimators for the branch lengths of phylogenetic trees, deriving their properties, and showing regularization to be a practically useful approach for phylogenetics.
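To illustrate why an adaptive-LASSO-type penalty can produce exact zeros where an unpenalized estimate cannot, here is a minimal toy sketch. It is not the authors' estimator: the phylogenetic likelihood is replaced by a simple quadratic loss around an initial estimate, and the function name `adaptive_lasso_shrink` and the parameters `lam`, `gamma`, and `eps` are hypothetical names chosen for this example. Under that assumption, each nonnegative branch length has a closed-form penalized solution in which short branches receive large weights and are shrunk to exactly zero.

```python
def adaptive_lasso_shrink(initial, lam, gamma=1.0, eps=1e-12):
    """Toy adaptive-LASSO shrinkage for nonnegative branch lengths.

    Assumption: the loss is 0.5 * (q - q0)**2 per branch, where q0 is an
    initial (unpenalized) estimate. This stands in for the negative
    log-likelihood and is NOT the phylogenetic likelihood from the paper.
    """
    shrunk = []
    for q0 in initial:
        # Adaptive weight: the smaller the initial estimate, the larger the penalty.
        weight = 1.0 / max(q0, eps) ** gamma
        # Minimizer of 0.5 * (q - q0)**2 + lam * weight * q subject to q >= 0.
        shrunk.append(max(0.0, q0 - lam * weight))
    return shrunk


# Hypothetical initial branch-length estimates: two clear branches, one tiny one.
branch_lengths = [0.5, 0.1, 0.001]
print(adaptive_lasso_shrink(branch_lengths, lam=0.001))
```

In this toy setting the two longer branches are barely changed, while the very short branch is set to exactly zero, which is the kind of behavior that would expose a polytomy or sampled ancestor.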