Cross-Lingual Entity Matching for Heterogeneous Online Wikis.

Weiming Lu, Peng Wang, Huan Wang, Jiahui Liu, Hao Dai, Baogang Wei
DOI: https://doi.org/10.1007/978-3-319-73618-1_78
2017-01-01
Abstract: Knowledge bases play an increasingly important role in many applications. However, many knowledge bases focus mainly on English knowledge and contain little knowledge for low-resource languages (LLs). If we can map the entities in LLs to those in high-resource languages (HLs), much knowledge, such as the relations between entities, can be transferred from HLs to LLs. In this paper, we propose an efficient and effective Cross-Lingual Entity Matching approach (CL-EM) that enriches the existing cross-lingual links using a learning-to-rank framework with learned language-independent features, including cross-lingual topic features and document embedding features. In the experiments, we verified our approach on the existing cross-lingual links between Chinese Wikipedia and English Wikipedia by comparing it with other state-of-the-art approaches. In addition, we discovered 141,754 new cross-lingual links between Baidu Baike and English Wikipedia, which almost doubles the number of existing cross-lingual links.