Fusing Syntax and Word Embedding Knowledge for Measuring Semantic Similarity

Zhuo Tang, Li Zhu, Yuquan Le, Kenli Li, Colin Cai
DOI: https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00062
2019-01-01
Abstract: The explosive growth of information makes effectively mining useful information from massive data an important problem. Text is an important carrier of information, so text processing and analysis has become a focus of data mining and information retrieval. Sentence similarity underlies most text-related tasks. Most current approaches represent text pairs using pairwise similarity features. In contrast, we propose a new method that analyzes and quantifies the semantic textual similarity between sentences by encoding word-embedding-based semantic knowledge into the syntax trees of the sentences. We test our approach on the SemEval-2012 task and evaluate its performance with two widely used metrics, the Spearman and Pearson correlations. The experimental results show that, compared with the best systems of the semantic textual similarity (STS) task, our method effectively improves the accuracy of sentence similarity judgments.
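As an illustration of the two evaluation metrics named in the abstract, the following minimal sketch computes the Pearson and Spearman correlations between predicted and gold similarity scores. It is not the authors' evaluation code: the function names (`pearson`, `ranks`, `spearman`) and the toy scores are assumptions made for this example, with gold scores assumed to lie on a 0 to 5 scale.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: human-annotated scores and a system's predictions.
gold = [0.0, 1.5, 2.5, 4.0, 5.0]
predicted = [0.2, 1.0, 3.0, 3.5, 4.8]

print("Pearson: ", pearson(gold, predicted))
print("Spearman:", spearman(gold, predicted))
```

Pearson measures how linearly the predictions track the gold scores, while Spearman depends only on their ordering, so a system that ranks every sentence pair correctly reaches a Spearman correlation of 1 (up to floating-point rounding) even when its raw scores are off.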