Learning Word Embeddings for Paraphrase Scoring in Knowledge Base Based Question Answering

Chendi ZHAN, Zhenhua LING, Lirong DAI
DOI: https://doi.org/10.16451/j.cnki.issn1003-6059.201609007
2017-01-01
Abstract: Conventional word embeddings are learned from the co-occurrence probabilities of words within the same sentence. The learning algorithm is task-independent and unsupervised. This paper proposes a method for constructing word embeddings that uses paraphrase constraints to improve paraphrase scoring with word embeddings and a bag-of-words model in knowledge base (KB) based question answering (QA). In the proposed method, pairs of paraphrase questions and pairs of non-paraphrase questions are collected from a database of question paraphrases according to a set of designed rules. Inequalities describing the similarities between the question pairs are then used to represent semantic constraints at the sentence level, and these inequalities are integrated into the objective function for training word embeddings. Experimental results show that the proposed method improves the accuracy of paraphrase scoring and KB-based question answering compared with conventional word embedding methods.
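As a rough illustration of how a sentence-level inequality constraint could enter a training objective, the sketch below scores question pairs with averaged (bag-of-words) word vectors and penalizes cases where a non-paraphrase pair is not sufficiently less similar than a paraphrase pair. The abstract does not give the similarity function, the exact form of the inequalities, or how they are weighted in the objective, so the cosine similarity, the hinge form, the `margin` parameter, and all function names and toy vectors here are assumptions made for illustration only.

```python
import math


def sentence_vector(tokens, embeddings):
    """Bag-of-words sentence vector: the average of the known word vectors."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:  # skip out-of-vocabulary words
            continue
        for i, x in enumerate(vec):
            total[i] += x
        count += 1
    return [x / count for x in total] if count else total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def paraphrase_hinge(question, paraphrase, non_paraphrase, embeddings, margin=0.1):
    """Hinge penalty for the (assumed) inequality
    sim(question, paraphrase) >= sim(question, non_paraphrase) + margin.
    The penalty is zero when the inequality holds."""
    q = sentence_vector(question, embeddings)
    s_pos = cosine(q, sentence_vector(paraphrase, embeddings))
    s_neg = cosine(q, sentence_vector(non_paraphrase, embeddings))
    return max(0.0, margin - s_pos + s_neg)


# Hypothetical 2-dimensional embeddings, chosen only to make the example run.
toy_embeddings = {
    "who": [1.0, 0.0],
    "wrote": [0.0, 1.0],
    "authored": [0.1, 0.9],
    "directed": [1.0, -0.5],
    "hamlet": [0.5, 0.5],
}

question = ["who", "wrote", "hamlet"]
paraphrase = ["who", "authored", "hamlet"]
non_paraphrase = ["who", "directed", "hamlet"]

loss_small_margin = paraphrase_hinge(question, paraphrase, non_paraphrase, toy_embeddings, margin=0.1)
loss_large_margin = paraphrase_hinge(question, paraphrase, non_paraphrase, toy_embeddings, margin=0.5)
```

In a full training setup, a penalty of this kind would presumably be summed over the collected question pairs and combined with the usual co-occurrence-based embedding objective, so that gradient updates move the word vectors toward satisfying the inequalities; the weighting between the two terms is not specified in the abstract.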