Two Metrics on Rooted Unordered Trees with Labels

Yue Wang
DOI: https://doi.org/10.1186/s13015-022-00220-0
2022-05-21
Abstract: The early development of a zygote can be mathematically described by a developmental tree. To compare developmental trees of different species, we need to define distances on trees. If children cells after a division are not distinguishable, developmental trees are represented by the space $\mathcal{T}$ of rooted trees with possibly repeated labels, where all vertices are unordered. If children cells after a division are partially distinguishable, developmental trees are represented by the space $\mathcal{P}$ of rooted trees with possibly repeated labels, where vertices can be ordered or unordered. On $\mathcal{T}$, the space of rooted unordered trees with possibly repeated labels, we define two metrics: the best-match metric and the left-regular metric, which show some advantages over existing methods. On $\mathcal{P}$, the space of rooted labeled trees with ordered or unordered vertices, there is no metric, and we define a semimetric, which is a variant of the best-match metric. To compute the best-match distance between two trees, the expected time complexity and worst-case time complexity are both $\mathcal{O}(n^2)$, where $n$ is the tree size. To compute the left-regular distance between two trees, the expected time complexity is $\mathcal{O}(n)$, and the worst-case time complexity is $\mathcal{O}(n\log n)$. For rooted labeled trees with (fully/partially) unordered vertices, we define metrics (semimetric) that have fast algorithms to compute and have advantages over existing methods. Such trees also appear outside of developmental biology, and such metrics can be applied to other types of trees which have more extensive applications, especially in molecular biology.
Combinatorics, Discrete Mathematics
What problem does this paper attempt to address?