Revisiting Correlations between Intrinsic and Extrinsic Evaluations of Word Embeddings.

Yuanyuan Qiu, Hongzheng Li, Shen Li, Yingdi Jiang, Renfen Hu, Lijiao Yang
DOI: https://doi.org/10.1007/978-3-030-01716-3_18
2018-01-01
Abstract: The evaluation of word embeddings has received considerable attention in recent years, but it remains debated whether intrinsic measures can predict performance on downstream tasks. To investigate this question, this paper presents the first study of the correlation between intrinsic and extrinsic evaluation results for Chinese word embeddings. We use word similarity and word analogy as the intrinsic tasks, and Named Entity Recognition and Sentiment Classification as the extrinsic tasks. The experiments use a variety of Chinese word embeddings trained with different corpora and context features. From the data analysis, we reach two conclusions: there are strong correlations between intrinsic and extrinsic evaluations, and performance on different tasks can be affected by training corpora and context features to varying degrees.