String Transduction with Target Language Models and Insertion Handling

Garrett Nicolai, Saeed Najafi, Grzegorz Kondrak
DOI: https://doi.org/10.48550/arXiv.1809.07182
2018-09-19
Abstract: Many character-level tasks can be framed as sequence-to-sequence transduction, where the target is a word from a natural language. We show that leveraging target language models derived from unannotated target corpora, combined with a precise alignment of the training data, yields state-of-the-art results on cognate projection, inflection generation, and phoneme-to-grapheme conversion.
Computation and Language