Integrating Word Embeddings and Traditional NLP Features to Measure Textual Entailment and Semantic Relatedness of Sentence Pairs

Jiang Zhao, Man Lan, Zheng-Yu Niu, Yue Lu
DOI: https://doi.org/10.1109/ijcnn.2015.7280462
2015-01-01
Abstract: In recent years, distributed representations of words (i.e., word embeddings) have been shown to significantly improve performance in many natural language processing tasks, such as part-of-speech tagging, chunking, named entity recognition, and sentiment polarity judgement. However, those tasks involve only a single sentence. In contrast, this paper evaluates the effectiveness of word embeddings in sentence pair classification and regression problems. Specifically, we propose novel, simple yet effective features based on word embeddings and also extract many traditional linguistic features. These features then serve as input to a classification/regression algorithm, both in isolation and in combination. Evaluations are conducted on three sentence pair classification/regression tasks: textual entailment, cross-lingual textual entailment, and semantic relatedness estimation. Experiments on benchmark datasets provided by Semantic Evaluation (SemEval) 2013 and 2014 show that using word embeddings significantly improves performance and that our results outperform the best results reported so far.
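
The abstract does not spell out the individual features, so the following is only a minimal sketch of the general idea: embedding-based features for a sentence pair are computed, then concatenated with traditional surface features into one vector for a classifier or regressor. Every specific here is an assumption made for illustration and not the authors' feature set: the toy 3-dimensional vectors in `TOY_EMBEDDINGS`, the averaging of word vectors, the cosine and absolute-difference features, and the Jaccard overlap and length-difference features.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# Real experiments would load pretrained vectors instead.
TOY_EMBEDDINGS = {
    "a": [0.1, 0.0, 0.2],
    "man": [0.9, 0.1, 0.3],
    "person": [0.8, 0.2, 0.3],
    "is": [0.0, 0.1, 0.1],
    "running": [0.2, 0.9, 0.4],
    "jogging": [0.3, 0.8, 0.5],
}


def sentence_vector(tokens, embeddings, dim):
    """Average the vectors of in-vocabulary tokens; zero vector if none."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def jaccard(tokens_a, tokens_b):
    """Word-overlap ratio, a stand-in for a traditional surface feature."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def pair_features(s1, s2, embeddings=TOY_EMBEDDINGS, dim=3):
    """Build one feature vector for a sentence pair (hypothetical layout)."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    v1 = sentence_vector(t1, embeddings, dim)
    v2 = sentence_vector(t2, embeddings, dim)
    # Embedding-based features: similarity plus per-dimension difference.
    embedding_feats = [cosine(v1, v2)] + [abs(a - b) for a, b in zip(v1, v2)]
    # Traditional features: lexical overlap and length difference.
    traditional_feats = [jaccard(t1, t2), abs(len(t1) - len(t2))]
    return embedding_feats + traditional_feats


features = pair_features("A man is running", "A person is jogging")
print(features)
```

Using only `embedding_feats` or only `traditional_feats` corresponds to evaluating each feature group in isolation, while the concatenated vector corresponds to the combined setting. The resulting vectors could be passed to any standard classifier for entailment labels or to a regressor for relatedness scores.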