A Comparative Study on Vocabulary Reduction for Phrase Table Smoothing

Yunsu Kim, Andreas Guta, Joern Wuebker, Hermann Ney
DOI: https://doi.org/10.48550/arXiv.1901.01574
2019-01-07
Abstract: This work systematically analyzes the smoothing effect of vocabulary reduction for phrase translation models. We extensively compare various word-level vocabularies to show that the performance of smoothing is not significantly affected by the choice of vocabulary. This result provides empirical evidence that the standard phrase translation model is extremely sparse. Our experiments also reveal that vocabulary reduction is more effective for smoothing large-scale phrase tables.
Computation and Language