Abstract: Phylodynamics and diversification studies using complex evolutionary models can be challenging, especially with traditional likelihood-based approaches. As an alternative, likelihood-free simulation-based approaches have been proposed due to their ability to incorporate complex models and scenarios. Here, we propose a new simulation-based deep learning (DL) method capable of analyzing large datasets and accurately estimating parameter values for birth-death models in both phylodynamics and diversification studies. Our approach involves encoding trees by extracting a vector of local features for all nodes of the input phylogeny. We also developed a dedicated convolutional neural network architecture called PhyloCNN. Using simulations, we compared the accuracy of PhyloCNN when using feature vectors with a variable number of generations to describe the local context of nodes and leaves. The number of generations had a greater impact when considering smaller training sets, with a broader context showing higher accuracy, especially for complex evolutionary models. Compared to other recently developed DL approaches, PhyloCNN showed higher or similar accuracies for all parameters when used with training sets one or two orders of magnitude smaller (10,000 to 100,000 simulated training trees, instead of millions). We applied PhyloCNN with compelling results to two real-world phylodynamics and diversification datasets, related to HIV superspreaders in Zurich and to primates and their ecological role as seed dispersers. The high accuracy and computational efficiency of our method open new possibilities for phylodynamics and diversification studies that need to account for idiosyncratic phylogenetic histories with specific parameter spaces and sampling scenarios not considered in more general approaches.
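The idea of encoding a tree as one local feature vector per node, with a context spanning a chosen number of generations, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Node` class, the three features (branch length, number of descendant tips, number of children), the choice of ancestors as the context, and the zero padding are hypothetical and are not taken from PhyloCNN, whose actual feature set and context definition are not given in the abstract.

```python
class Node:
    """Hypothetical minimal tree node (not PhyloCNN's data structure)."""

    def __init__(self, branch_length=0.0):
        self.branch_length = branch_length
        self.children = []
        self.parent = None

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child


def count_tips(node):
    """Number of leaves in the subtree rooted at `node`."""
    if not node.children:
        return 1
    return sum(count_tips(c) for c in node.children)


def local_features(node):
    """Assumed per-node features: branch length, tip count, child count."""
    return [node.branch_length, float(count_tips(node)), float(len(node.children))]


def encode_node(node, generations):
    """Concatenate the node's features with those of up to `generations`
    ancestors, zero-padding when the root is reached early."""
    vec = local_features(node)
    ancestor = node.parent
    for _ in range(generations):
        if ancestor is not None:
            vec.extend(local_features(ancestor))
            ancestor = ancestor.parent
        else:
            vec.extend([0.0, 0.0, 0.0])
    return vec


def all_nodes(root):
    """Pre-order traversal over every node of the tree."""
    yield root
    for child in root.children:
        yield from all_nodes(child)


def encode_tree(root, generations=2):
    """One fixed-length row per node; the rows form the network input."""
    return [encode_node(n, generations) for n in all_nodes(root)]


# Toy tree: root -> (a -> (c, d), b)
root = Node()
a = root.add_child(Node(1.0))
b = root.add_child(Node(2.0))
c = a.add_child(Node(0.5))
d = a.add_child(Node(0.5))

matrix = encode_tree(root, generations=1)
for row in matrix:
    print(row)
```

In this sketch, increasing `generations` widens each row by three values per extra ancestor, which is one simple way to picture the broader or narrower local context that the abstract compares.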

Deep Neural Networks and the Tree of Life

Discovering Novel Biological Traits From Images Using Phylogeny-Guided Neural Networks

Impacts of Darwinian Evolution on Pre-trained Deep Neural Networks

A Review of Artificial Intelligence based Biological-Tree Construction: Priorities, Methods, Applications and Trends

Novel Symmetry-preserving Neural Network Model for Phylogenetic Inference

PhyloCNN: Improving tree representation and neural network architecture for deep learning from trees in phylodynamics and diversification studies

In the Light of Deep Coalescence: Revisiting Trees Within Networks

A Novel Biologically Inspired ELM-based Network for Image Recognition

Learn Decision Trees with Deep Visual Primitives

Self-born Wiring for Neural Trees

An Improved Res-UNet Model for Tree Species Classification Using Airborne High-Resolution Images

Reliable estimation of tree branch lengths using deep neural networks

Visual Genealogy of Deep Neural Networks

ARTree: A Deep Autoregressive Model for Phylogenetic Inference

Deep Connectomics Networks: Neural Network Architectures Inspired by Neuronal Networks

Finding Better Topologies for Deep Convolutional Neural Networks by Evolution

Evolving Deep Convolutional Neural Networks for Image Classification

BioCLIP: A Vision Foundation Model for the Tree of Life

What Do You See in Common? Learning Hierarchical Prototypes over Tree-of-Life to Discover Evolutionary Traits

Deep Learning of Path-Based Tree Classifiers for Large-Scale Plant Species Identification

Deep neural networks: a new framework for modelling biological vision and brain information processing