Translation as Cross-Domain Knowledge - Attention Augmentation for Unsupervised Cross-Domain Segmenting and Labeling Tasks.

Ruixuan Luo,Yi Zhang,Sishuo Chen,Xu Sun
DOI: https://doi.org/10.18653/v1/2021.findings-emnlp.163
2021-01-01
Abstract:The nature of no word delimiter or inflection that can indicate segment boundaries or word semantics increases the difficulty of Chinese text understanding, and also intensifies the demand for word-level semantic knowledge to accomplish the tagging goal in Chinese segmenting and labeling tasks. However, for unsupervised Chinese cross-domain segmenting and labeling tasks, the model trained on the source domain frequently suffers from the deficient word-level semantic knowledge of the target domain. To address this issue, we propose a novel paradigm based on attention augmentation to introduce crucial cross-domain knowledge via a translation system. The proposed paradigm enables the model attention to draw cross-domain knowledge indicated by the implicit word-level cross-lingual alignment between the input and its corresponding translation. Aside from the model requiring cross-lingual input, we also establish an off-the-shelf model which eludes the dependency on cross-lingual translations. Experiments demonstrate that our proposal significantly advances the state-of-the-art results of cross-domain Chinese segmenting and labeling tasks.
What problem does this paper attempt to address?