Coordinate System Selection for Minimum Error Rate Training in Statistical Machine Translation

Chen Lijiang
DOI: https://doi.org/10.48550/arXiv.1405.2434
2014-05-10
Abstract:Minimum error rate training (MERT) is a widely used training procedure for statistical machine translation. A general problem of this approach is that the search space is easy to converge to a local optimum and the acquired weight set is not in accord with the real distribution of feature functions. This paper introduces coordinate system selection (RSS) into the search algorithm for MERT. Contrary to previous approaches in which every dimension only corresponds to one independent feature function, we create several coordinate systems by moving one of the dimensions to a new direction. The basic idea is quite simple but critical that the training procedure of MERT should be based on a coordinate system formed by search directions but not directly on feature functions. Experiments show that by selecting coordinate systems with tuning set results, better results can be obtained without any other language knowledge.
Computation and Language
What problem does this paper attempt to address?