A Chinese Text Paraphrase Detection Method Based on Dependency Tree.
Yipeng Jiang, Yu Hao, Xiaoyan Zhu
DOI: https://doi.org/10.1109/icnsc.2016.7479003
2016-01-01
Abstract: Paraphrase detection is an important subtask in many natural language processing tasks. For example, question answering relies on it to find similar questions, and it is also widely used in information retrieval, machine translation, document clustering, and other applications. Traditional solutions fall into two main types. The first is based on bag of words, which considers only the words in the sentences and the degree of similarity between words. The second is based on word embeddings and deep neural networks, which compose word vectors into sentence vectors in deep models. The deeper layers of these models may capture higher-level information in a sentence, such as phrase and syntactic information, but the models may also lose some sentence information. We propose a new method that considers word similarity and also directly uses the dependency relations in sentences. We train our model on a Chinese text corpus. By computing dependency relation similarities and word similarities, we decide whether one sentence is a paraphrase of another.
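The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea of combining word similarity with dependency relation similarity, not the authors' model. Everything in it is an assumption made for illustration: the character-overlap `word_sim` (a stand-in for a learned word similarity), the `(head, relation, dependent)` triple format, the relation labels `root`, `nsubj`, and `dobj`, the weight `alpha`, and the decision `threshold`. Dependency triples are written by hand here; in practice they would come from a Chinese dependency parser.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: (head word, relation label, dependent word).
Triple = Tuple[str, str, str]


def word_sim(a: str, b: str) -> float:
    """Toy word similarity: Jaccard overlap of characters (assumption)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def triple_sim(t1: Triple, t2: Triple) -> float:
    """Relations with different labels score 0; otherwise average the
    similarity of the two heads and of the two dependents."""
    if t1[1] != t2[1]:
        return 0.0
    return 0.5 * (word_sim(t1[0], t2[0]) + word_sim(t1[2], t2[2]))


def directed_score(xs: Sequence, ys: Sequence, sim: Callable) -> float:
    """Average, over items in xs, of the best match found in ys."""
    if not xs or not ys:
        return 0.0
    return sum(max(sim(x, y) for y in ys) for x in xs) / len(xs)


def symmetric_score(xs: Sequence, ys: Sequence, sim: Callable) -> float:
    """Average of the two directed scores, so argument order does not matter."""
    return 0.5 * (directed_score(xs, ys, sim) + directed_score(ys, xs, sim))


def words_of(triples: List[Triple]) -> List[str]:
    """Collect the distinct words of a sentence, skipping the ROOT marker."""
    seen: List[str] = []
    for head, _, dep in triples:
        for w in (head, dep):
            if w != "ROOT" and w not in seen:
                seen.append(w)
    return seen


def paraphrase_score(ta: List[Triple], tb: List[Triple],
                     alpha: float = 0.5) -> float:
    """Weighted mix of word-level and dependency-level similarity."""
    w = symmetric_score(words_of(ta), words_of(tb), word_sim)
    d = symmetric_score(ta, tb, triple_sim)
    return alpha * w + (1.0 - alpha) * d


def is_paraphrase(ta: List[Triple], tb: List[Triple],
                  threshold: float = 0.6, alpha: float = 0.5) -> bool:
    """Decide by thresholding the combined score (threshold is an assumption)."""
    return paraphrase_score(ta, tb, alpha) >= threshold


# Hand-written parses: "我 喜欢 苹果", "我 喜爱 苹果", "他 去 学校".
sent_a = [("ROOT", "root", "喜欢"), ("喜欢", "nsubj", "我"), ("喜欢", "dobj", "苹果")]
sent_b = [("ROOT", "root", "喜爱"), ("喜爱", "nsubj", "我"), ("喜爱", "dobj", "苹果")]
sent_c = [("ROOT", "root", "去"), ("去", "nsubj", "他"), ("去", "dobj", "学校")]
```

With these toy inputs, `is_paraphrase(sent_a, sent_b)` holds while `is_paraphrase(sent_a, sent_c)` does not. Scoring dependency triples separately from words means two sentences that share vocabulary but attach the words differently (for example, swapped subject and object) receive a lower dependency score than a bag-of-words comparison alone would suggest.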