Cross-Language Information Retrieval Based on Multiple Information

Pengyuan Liu, Zhijun Zheng, Qi Su
DOI: https://doi.org/10.1109/WI.2018.00-26
2018-01-01
Abstract: As predicted by Internet Data Center (IDC), the amount of global language data will exceed 40 ZB by 2020. With the globalization of information, breaking down the barriers between languages has become an urgent task for web retrieval. In this paper, we propose integrating semantic and lexical information to address the task of cross-language information retrieval (CLIR). The approach does not rely on external knowledge bases, and so avoids the limitation that such knowledge bases cannot handle Internet neologisms. Experiments on the Sogou dataset show the feasibility of the approach.