A Structured Variational Autoencoder for Contextual Morphological Inflection

Lawrence Wolf-Sonkin, Jason Naradowsky, Sabrina J. Mielke, Ryan Cotterell
DOI: https://doi.org/10.48550/arXiv.1806.03746
2018-06-10
Computation and Language
Abstract: Statistical morphological inflectors are typically trained on fully supervised, type-level data. One remaining open research question is the following: How can we effectively exploit raw, token-level data to improve their performance? To this end, we introduce a novel generative latent-variable model for the semi-supervised learning of inflection generation. To enable posterior inference over the latent variables, we derive an efficient variational inference procedure based on the wake-sleep algorithm. We experiment on 23 languages, using the Universal Dependencies corpora in a simulated low-resource setting, and find improvements of over 10% absolute accuracy in some cases.