Non-Parametric Bayesian Areal Linguistics

Hal Daumé III
DOI: https://doi.org/10.48550/arXiv.0906.5114
2009-06-28
Abstract: We describe a statistical model over linguistic areas and phylogeny. Our model recovers known areas and identifies a plausible hierarchy of areal features. The use of areas improves genetic reconstruction of languages both qualitatively and quantitatively according to a variety of metrics. We model linguistic areas by a Pitman-Yor process and linguistic phylogeny by Kingman's coalescent.
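To make the clustering prior named in the abstract concrete, here is a minimal sketch of drawing a random partition from a Pitman-Yor process through its Chinese-restaurant-style sequential construction. It illustrates the prior only and is not the paper's model or inference procedure. The function name `sample_pitman_yor_partition` and the parameters `discount` and `concentration` are hypothetical names chosen for this illustration; reading items as languages and clusters as areas is likewise an assumption.

```python
import random


def sample_pitman_yor_partition(n_items, discount, concentration, seed=None):
    """Draw a random partition of n_items from a Pitman-Yor process.

    Hypothetical helper for illustration only. Items could stand in for
    languages and clusters for linguistic areas (an assumption).
    Requires 0 <= discount < 1 and concentration > -discount.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    cluster_sizes = []  # cluster_sizes[k] = number of items in cluster k
    assignments = []    # assignments[i] = cluster index of item i

    for _ in range(n_items):
        if not cluster_sizes:
            # The first item always opens the first cluster.
            choice = 0
        else:
            # Existing cluster k has weight (size_k - discount);
            # a new cluster has weight (concentration + discount * K).
            weights = [size - discount for size in cluster_sizes]
            weights.append(concentration + discount * len(cluster_sizes))
            choice = rng.choices(range(len(weights)), weights=weights)[0]

        if choice == len(cluster_sizes):
            cluster_sizes.append(1)
        else:
            cluster_sizes[choice] += 1
        assignments.append(choice)

    return assignments, cluster_sizes


if __name__ == "__main__":
    labels, sizes = sample_pitman_yor_partition(
        n_items=20, discount=0.5, concentration=1.0, seed=0
    )
    print("cluster label per item:", labels)
    print("cluster sizes:", sizes)
```

Setting `discount` to 0 reduces this to the Dirichlet-process version of the same construction; larger discounts tend to yield more small clusters.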
Computation and Language
What problem does this paper attempt to address?