Abstract: Machine translation has wide applications in daily life. In mission-critical applications such as translating official documents, an incorrect translation can have unpleasant or sometimes catastrophic consequences. This motivates recent research on testing methodologies for machine translation systems. Existing methodologies mostly rely on metamorphic relations designed at the textual level (e.g., Levenshtein distance) or syntactic level (e.g., the distance between grammar structures) to determine the correctness of translation results. However, these metamorphic relations do not consider whether the original and translated sentences have the same meaning (i.e., semantic similarity). Therefore, in this paper, we propose SemMT, an automatic testing approach for machine translation systems based on semantic similarity checking. SemMT applies round-trip translation and measures the semantic similarity between the original and translated sentences. Our insight is that the semantics expressed by the logical and numeric constraints in sentences can be captured using regular expressions (or deterministic finite automata), for which efficient equivalence and similarity checking algorithms are available. Leveraging this insight, we propose three semantic similarity metrics and implement them in SemMT. The experimental results show that SemMT is more effective than state-of-the-art works, achieving increases of 21% and 23% in accuracy and F-score, respectively. We also explore potential improvements that can be achieved when proper combinations of metrics are adopted. Finally, we discuss a solution to locate the suspicious trip in round-trip translation, which may shed light on further exploration.
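To make the regular-expression insight concrete, the following is a minimal Python sketch and not one of SemMT's three metrics. It assumes that the original sentence and its round-trip translation have already been converted to regular expressions (that conversion step is not shown), and it approximates similarity as the Jaccard overlap of the strings each expression accepts up to a bounded length. The function names, the binary alphabet, and the length bound are illustrative assumptions.

```python
import re
from itertools import product


def accepted_strings(pattern, alphabet, max_len):
    """Return every string over `alphabet` with length <= max_len that fully matches `pattern`."""
    compiled = re.compile(pattern)
    accepted = set()
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                accepted.add(candidate)
    return accepted


def bounded_jaccard(pattern_a, pattern_b, alphabet="01", max_len=4):
    """Hypothetical similarity score: Jaccard overlap of the two bounded languages.

    This brute-force enumeration is a stand-in for automata-based checking and
    only sees strings up to `max_len`, so it is an approximation.
    """
    lang_a = accepted_strings(pattern_a, alphabet, max_len)
    lang_b = accepted_strings(pattern_b, alphabet, max_len)
    if not lang_a and not lang_b:
        return 1.0  # both languages are empty within the bound
    return len(lang_a & lang_b) / len(lang_a | lang_b)


# "at least two symbols" written in two equivalent ways
same_meaning = bounded_jaccard(r"[01]{2,}", r"[01][01]+")
# "at least two symbols" versus "at most two symbols" (a meaning-changing translation)
changed_meaning = bounded_jaccard(r"[01]{2,}", r"[01]{0,2}")
print(same_meaning, changed_meaning)
```

Under these assumptions, two differently worded but equivalent constraints score 1.0, while swapping "at least" for "at most" leaves only the length-two strings in common and scores much lower. A test harness could flag a round-trip translation as suspicious when the score falls below a chosen threshold.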

SemMT: A Semantic-based Testing Approach for Machine Translation Systems
