ECNU: Using Traditional Similarity Measurements and Word Embedding for Semantic Textual Similarity Estimation.

Jiang Zhao, Man Lan, Junfeng Tian
DOI: https://doi.org/10.18653/v1/s15-2021
2015-01-01
Abstract: This paper reports our submissions to the semantic textual similarity task, i.e., Task 2 in Semantic Evaluation 2015. We built our systems using various traditional features, such as string-based, corpus-based and syntactic similarity metrics, as well as novel similarity measures based on distributed word representations, which were trained using deep learning paradigms. Since the training and test datasets consist of instances collected from various domains, we explored three strategies for using the training datasets: (1) use all available training datasets to build a single unified supervised model for all test datasets; (2) select the most similar training dataset and construct an individual model for each test set; (3) adopt a multi-task learning framework to make full use of the available training sets. Results on the test datasets show that using all datasets as the training set achieves the best average performance, and our best system ranks 15th out of 73.
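To make the two feature families in the abstract concrete, the following is a minimal sketch of one string-based feature (Jaccard word overlap) and one embedding-based feature (cosine similarity of averaged word vectors). It is an illustration under stated assumptions, not the authors' feature set: the function names, whitespace tokenization, and the tiny `TOY_EMB` table are all hypothetical stand-ins for trained word representations.

```python
import math

# Hypothetical 2-dimensional toy embeddings; a real system would load
# pretrained distributed word representations instead.
TOY_EMB = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "sat": [0.0, 1.0],
    "ran": [0.1, 0.9],
}


def jaccard(s1, s2):
    """String-based feature: word-set overlap between two sentences."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    if not a and not b:
        return 1.0  # two empty sentences are treated as identical
    return len(a & b) / len(a | b)


def mean_vector(tokens, emb, dim):
    """Average the vectors of the tokens found in the embedding table."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(s1, s2, emb=TOY_EMB, dim=2):
    """Feature vector for a sentence pair: [Jaccard, embedding cosine]."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    return [
        jaccard(s1, s2),
        cosine(mean_vector(t1, emb, dim), mean_vector(t2, emb, dim)),
    ]
```

In a supervised setup such as the one the abstract describes, feature vectors like the output of `pair_features` would be fed to a regressor trained on gold similarity scores; the choice of regressor and of the training-set strategy is outside this sketch.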