SemMT: A Semantic-based Testing Approach for Machine Translation Systems

Jialun Cao, Meiziniu Li, Yeting Li, Ming Wen, Shing-Chi Cheung

DOI: https://doi.org/10.1145/3490488

2021-10-09

Abstract: Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic similarity between the original and translated sentences. Our insight is that the semantics expressed by the logic and numeric constraints in sentences can be captured using regular expressions (or deterministic finite automata), for which efficient equivalence/similarity checking algorithms are available. Leveraging this insight, we propose three semantic similarity metrics and implement them in SemMT. The experimental results reveal that SemMT can achieve higher effectiveness compared with state-of-the-art works, achieving an increase of 21% and 23% on accuracy and F-Score, respectively. We also explore potential improvements that can be achieved when proper combinations of metrics are adopted. Finally, we discuss a solution to locate the suspicious trip in round-trip translation, which may shed light on further exploration.

Software Engineering, Computation and Language

What problem does this paper attempt to address?

The paper addresses a gap in how machine translation systems are tested. Existing testing methods rely mainly on metamorphic relations at the textual level (e.g., Levenshtein distance) or the syntactic level (e.g., the distance between grammar structures) to judge whether a translation is correct. These relations do not check whether the original sentence and the translated sentence have the same meaning (i.e., semantic similarity). To close this gap, the paper proposes SemMT, an automatic testing approach based on semantic similarity checking that aims to detect mistranslations in machine translation systems. Specifically, the paper points out:

- **Existing problems**: Existing testing methods focus on textual or syntactic similarity and ignore semantic similarity. Two sentences can be very close in text or syntax while their meanings are completely different.
- **Solutions**: The paper proposes a new testing approach, SemMT, which assesses translation correctness through round-trip translation and semantic similarity checking. SemMT uses regular expressions (or deterministic finite automata) to capture the logic and numeric constraints expressed in sentences, and applies efficient equivalence/similarity checking algorithms to quantify semantic similarity.
- **Innovations**: SemMT proposes three semantic similarity metrics and evaluates them experimentally. According to the abstract, SemMT improves on state-of-the-art works by 21% in accuracy and 23% in F-Score.

In conclusion, the main contribution of the paper is a testing framework for machine translation systems based on semantic similarity, which detects mistranslations more effectively, especially in sentences involving logic and numeric constraints.
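The idea of comparing an original sentence with its round-trip translation through regular expressions can be illustrated with a small sketch. It assumes both sentences have already been converted to regular expressions (that conversion step is not shown here), and it uses a bounded Jaccard similarity over the accepted strings as a stand-in metric. The function names, the threshold, and the metric itself are hypothetical and chosen for illustration; they are not the three metrics defined in the paper.

```python
import re
from itertools import product


def accepted_strings(pattern, alphabet, max_len):
    """Collect every string over `alphabet`, up to `max_len` characters,
    that `pattern` matches in full."""
    compiled = re.compile(pattern)
    accepted = set()
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                accepted.add(candidate)
    return accepted


def regex_similarity(pattern_a, pattern_b, alphabet="ab", max_len=4):
    """Illustrative metric (an assumption, not the paper's): Jaccard
    similarity of the two bounded sets of accepted strings."""
    set_a = accepted_strings(pattern_a, alphabet, max_len)
    set_b = accepted_strings(pattern_b, alphabet, max_len)
    union = set_a | set_b
    if not union:
        # Neither pattern accepts anything within the bound.
        return 1.0
    return len(set_a & set_b) / len(union)


def flag_suspicious(original_regex, round_trip_regex, threshold=0.9,
                    alphabet="ab", max_len=4):
    """Flag a round-trip translation whose regex drifts from the original.
    The threshold value is a hypothetical choice."""
    score = regex_similarity(original_regex, round_trip_regex,
                             alphabet, max_len)
    return score < threshold


# "at least two a's" versus a round trip that came back as "at most two a's"
print(regex_similarity("a{2,}", "a{0,2}", alphabet="a", max_len=4))
print(flag_suspicious("a{2,}", "a{0,2}", alphabet="a", max_len=4))
```

Enumerating strings up to a fixed length keeps the sketch short, but it only approximates language equivalence. An exact check would compare the corresponding automata, which is the kind of algorithm the abstract refers to.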

Word Closure-Based Metamorphic Testing for Machine Translation

Evaluating Terminology Translation in Machine Translation Systems Via Metamorphic Testing

Differential Testing of Machine Translators Based on Compositional Semantics

Application and Analysis of Sentence Similarity Based Machine Translation Evaluation

SSMT: A Machine Translation Evaluation View to Paragraph-to-Sentence Semantic Similarity

Fairness Testing of Machine Translation Systems

Machine Translation Testing via Syntactic Tree Pruning

Better Simultaneous Translation with Monotonic Knowledge Distillation

Evaluation of Machine Translation Based on Semantic Dependencies and Keywords

STD: an Automatic Evaluation Metric for Machine Translation Based on Word Embeddings

Semantic Analysis and Evaluation of Translation Based on Abstract Meaning Representation

A Study on Automatic Scoring for Machine Translation Systems

Rethinking the Reasonability of the Test Set for Simultaneous Machine Translation

Back Deduction Based Testing for Word Sense Disambiguation Ability of Machine Translation Systems

Application and Analysis of String-Similarity-Based Machine Translation Evaluation

Context Consistency Between Training and Testing in Simultaneous Machine Translation

Automated Testing for Machine Translation Via Constituency Invariance

On the effectiveness of testing sentiment analysis systems with metamorphic testing