XLORE 3: A Large-scale Multilingual Knowledge Graph from Heterogeneous Wiki Knowledge Resources
Kaisheng Zeng, Hailong Jin, Xin Lv, Fangwei Zhu, Lei Hou, Yi Zhang, Fan Pang, Yu Qi, Dingxiao Liu, Juanzi Li, Ling Feng
DOI: https://doi.org/10.1145/3660521
2024-01-01
Abstract: In recent years, knowledge graphs (KGs) have attracted significant attention from academia and industry, resulting in the development of numerous technologies for KG construction, completion, and application. XLORE is one of the largest multilingual KGs, built from Baidu Baike and Wikipedia via a series of knowledge modelling and acquisition methods. In this paper, we apply systematic methods to improve XLORE’s data quality and present its latest version, XLORE 3, which enables the effective integration and management of heterogeneous knowledge from diverse resources. Compared with previous versions, XLORE 3 has three major advantages: 1) We design a comprehensive and reasonable schema, namely the XLORE ontology, which can effectively organize and manage entities from various resources. 2) We merge equivalent entities in different languages to facilitate knowledge sharing, and we provide a large-scale entity linking system to establish associations between unstructured text and the structured KG. 3) We design a multi-strategy knowledge completion framework, which leverages pre-trained language models and vast amounts of unstructured text to discover missing and new facts. The resulting KG contains 446 concepts, 2,608 properties, 66 million entities, and more than 2 billion facts. It is available and downloadable online, providing a valuable resource for researchers and practitioners in various fields.