Knowledge-Enhanced Bilingual Textual Representations for Cross-Lingual Semantic Textual Similarity

Hsuehkuan Lu, Yixin Cao, Hou Lei, Juanzi Li
DOI: https://doi.org/10.1007/978-981-15-0118-0_33
2019-01-01
Abstract: Joint learning of words and entities benefits various NLP tasks, but most existing work focuses on the single-language setting. Cross-lingual representation learning has recently received considerable attention, but it is still restricted by the availability of parallel data. This paper proposes a method that jointly embeds texts and entities using comparable data. In addition to evaluating on public semantic textual similarity datasets, the authors propose a new task, cross-lingual text extraction, to assess the similarity between texts, and contribute a dataset for it. The results show that the proposed method outperforms cross-lingual representation methods that use parallel data on cross-lingual tasks, and achieves competitive results on monolingual tasks.