Removing Input Confounder for Translation Quality Estimation via a Causal Motivated Method

Xuewen Shi, Heyan Huang, Ping Jian, Yi-Kun Tang
DOI: https://doi.org/10.1007/978-3-030-85896-4_28
2021-01-01
Abstract: Most state-of-the-art QE systems built upon neural networks have achieved promising performance on benchmark datasets. However, the performance of these methods can be easily influenced by inherent features of the model input, such as the length of the input sequence or the number of unseen tokens. In this paper, we introduce a causal-inference-based method to eliminate the negative impact that the characteristics of the input have on a QE system. Specifically, we propose an iterative denoising framework for multiple confounding features. The confounder elimination operation at each iteration step is implemented by a Half-Sibling Regression based method. We conduct our experiments on the official datasets and submissions from the WMT 2020 Quality Estimation Shared Task of Sentence-Level Direct Assessment. Experimental results show that the denoised QE results achieve higher Pearson correlation with human assessments than the original submissions.
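The iterative denoising idea can be illustrated with a minimal sketch. Half-Sibling Regression estimates a clean signal by subtracting from the observed value the part that can be predicted from a variable that shares only the noise source, i.e. Q_hat = Y - E[Y | X]. The abstract does not specify the regressor, the order in which features are processed, or any stopping rule, so everything below is an assumption made for illustration: the function names `hsr_denoise` and `iterative_denoise` are hypothetical, a least-squares polynomial fit stands in for E[Y | X], and each confounding feature is handled in a single pass.

```python
import numpy as np


def hsr_denoise(scores, feature, degree=1):
    """Remove the part of `scores` that is predictable from `feature`.

    Assumption: E[scores | feature] is approximated by a least-squares
    polynomial of the given degree (linear by default).
    """
    scores = np.asarray(scores, dtype=float)
    feature = np.asarray(feature, dtype=float)
    coeffs = np.polyfit(feature, scores, degree)
    predicted = np.polyval(coeffs, feature)
    # Subtract the predictable part, then restore the original mean
    # so the denoised scores stay on the same scale.
    return scores - predicted + scores.mean()


def iterative_denoise(scores, features, degree=1):
    """Apply `hsr_denoise` once per confounding feature, in order."""
    denoised = np.asarray(scores, dtype=float)
    for feature in features:
        denoised = hsr_denoise(denoised, feature, degree=degree)
    return denoised
```

Here `scores` would be the sentence-level QE predictions and `features` a list of per-sentence input properties, such as source length or the number of unseen tokens. Because Pearson correlation is invariant to shifting by a constant, restoring the mean does not affect the evaluation metric; it only keeps the output comparable to the original scores.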