Abstract: We seek to better understand the difference in quality of the several publicly released embeddings. We propose several tasks that help to distinguish the characteristics of different embeddings. Our evaluation of sentiment polarity and synonym/antonym relations shows that embeddings are able to capture surprisingly nuanced semantics even in the absence of sentence structure. Moreover, benchmarking the embeddings shows great variance in quality and characteristics of the semantics captured by the tested embeddings. Finally, we show the impact of varying the number of dimensions and the resolution of each dimension on the effective useful features captured by the embedding space. Our contributions highlight the importance of embeddings for NLP tasks and the effect of their quality on the final results.
What problem does this paper attempt to address?
The paper aims to understand what information word embeddings encode and to distinguish the characteristics of several publicly released embeddings through a series of tasks. Specifically, it focuses on the following aspects:
1. **Semantic-capturing ability of word embeddings**:
   - The paper evaluates how well word embeddings capture semantics when no sentence structure is available. The results show that they capture surprisingly nuanced semantic information.
2. **Quality differences among different word embeddings**:
   - By benchmarking multiple publicly released word embeddings, the paper shows large differences in their quality and in the semantic features they capture.
3. **The influence of dimension and resolution on the embedding space**:
   - The paper examines how the number of dimensions and the resolution of each dimension affect the useful features captured, which indicates how small an embedding space can be while remaining effective.
4. **The importance of word-pair directions**:
   - The paper argues that the direction between the two words of a pair encodes useful linguistic information. It tests this with two word-pair classification tasks; in one of them, classifying pairs works significantly better than classifying single words.
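The notion of a word-pair direction in item 4 can be illustrated with a small sketch. It assumes that a pair is represented by the difference of its two vectors, which this summary does not confirm as the paper's featurization. The data is synthetic, and `make_toy_pairs`, `fit_direction`, and `predict_pair` are hypothetical names introduced only for illustration.

```python
import random

def subtract(a, b):
    """Element-wise difference a - b: the 'direction' from b to a."""
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_toy_pairs(n_pairs, dim, seed=0):
    """Hypothetical data: each pair is a base vector shifted by +/- half a shared offset.

    Label 1 means the 'plus' variant comes first, label 0 means it comes second.
    """
    rng = random.Random(seed)
    offset = [rng.gauss(0.0, 0.3) for _ in range(dim)]
    pairs = []
    for _ in range(n_pairs):
        base = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = [b + o / 2 + rng.gauss(0.0, 0.05) for b, o in zip(base, offset)]
        minus = [b - o / 2 + rng.gauss(0.0, 0.05) for b, o in zip(base, offset)]
        if rng.random() < 0.5:
            pairs.append((plus, minus, 1))
        else:
            pairs.append((minus, plus, 0))
    return pairs

def fit_direction(train):
    """Average the pair differences, flipping label-0 pairs so they all agree."""
    total = [0.0] * len(train[0][0])
    for first, second, label in train:
        sign = 1.0 if label == 1 else -1.0
        diff = subtract(first, second)
        total = [t + sign * d for t, d in zip(total, diff)]
    return [t / len(train) for t in total]

def predict_pair(direction, first, second):
    """Classify a pair by which side of the learned direction its difference falls on."""
    return 1 if dot(direction, subtract(first, second)) > 0 else 0

pairs = make_toy_pairs(n_pairs=200, dim=20)
train, test = pairs[:150], pairs[150:]
direction = fit_direction(train)
pair_accuracy = sum(predict_pair(direction, a, b) == y for a, b, y in test) / len(test)
print(f"pair accuracy: {pair_accuracy:.2f}")
```

In this toy setting each word's vector is dominated by its base position, so the shared offset is much easier to recover from the pair difference than from either word alone. Real embeddings need not behave this cleanly.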
### Specific problems and methods
To achieve the above goals, the paper adopts the following methods:
- **Select four publicly available sets of word embeddings**: HLBL, SENNA, Turian's, and Huang's embeddings.
- **Design five classification tasks**:
  - **Sentiment polarity**: Use Lydia's sentiment lexicon to build sets of positive and negative words.
  - **Noun gender**: Use Bergsma's dataset to compile lists of masculine and feminine proper nouns.
  - **Singular and plural forms**: Extract the singular and plural forms of nouns from WordNet.
  - **Synonyms and antonyms**: Extract synonym and antonym pairs from WordNet.
  - **Regional spelling differences**: Collect pairs of words that are spelled differently in British and American English.
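As a rough illustration of how a single-word task such as sentiment polarity can be posed, the sketch below classifies words using only their embedding vectors. It is not the paper's setup: the vectors are synthetic, and a nearest-centroid classifier is swapped in as a simple stand-in because this summary does not say which classifier was used. All function names are hypothetical.

```python
import random

def make_toy_embeddings(n_per_class, dim, seed=0):
    """Hypothetical data: two word classes drawn around opposite centers."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            vec = [center + rng.gauss(0.0, 1.0) for _ in range(dim)]
            data.append((vec, label))
    rng.shuffle(data)
    return data

def centroid(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroids(train):
    """One centroid per class label, computed from the training vectors."""
    return {label: centroid([v for v, y in train if y == label]) for label in (0, 1)}

def predict(centroids, vec):
    """Assign the label whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: squared_distance(vec, centroids[label]))

def accuracy(centroids, test):
    return sum(predict(centroids, v) == y for v, y in test) / len(test)

data = make_toy_embeddings(n_per_class=100, dim=10)
train, test = data[:150], data[150:]
centroids = fit_centroids(train)
word_accuracy = accuracy(centroids, test)
print(f"single-word accuracy: {word_accuracy:.2f}")
```

With balanced classes, a majority-class baseline would score about 0.5, which is the kind of reference point the reported results are compared against.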
### Main findings
- **Effectiveness of word embeddings**: All the embeddings considered outperform the baseline on every task, including seemingly difficult ones such as sentiment detection.
- **Advantages of SENNA embeddings**: SENNA performs particularly well on the singular/plural task, probably because it emphasizes shallow syntactic features.
- **Influence of dimension and resolution**: Even when the resolution of each dimension is reduced drastically (for example, to a single bit), performance degrades only slightly, which indicates that the embedding space is highly robust.
- **Importance of word-pair classification**: In some tasks, such as regional spelling differences, classifying word pairs works significantly better than classifying single words.
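The reduced-resolution finding can be made concrete with a small experiment. The sketch below assumes that "1 bit" means keeping only the sign of each coordinate, which is one plausible reading and not confirmed by this summary. It reuses a nearest-centroid stand-in classifier on synthetic vectors; `sign_quantize` and `evaluate` are hypothetical helpers.

```python
import random

def sign_quantize(vec):
    """Keep one bit per dimension: +1.0 for non-negative values, -1.0 otherwise."""
    return [1.0 if x >= 0 else -1.0 for x in vec]

def make_toy_data(n_per_class, dim, seed=1):
    """Hypothetical data: two classes centered at -1 and +1 in every dimension."""
    rng = random.Random(seed)
    data = [([c + rng.gauss(0.0, 1.0) for _ in range(dim)], label)
            for label, c in ((0, -1.0), (1, 1.0))
            for _ in range(n_per_class)]
    rng.shuffle(data)
    return data

def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def evaluate(data, transform, n_train):
    """Nearest-centroid accuracy after applying `transform` to every vector."""
    mapped = [(transform(v), y) for v, y in data]
    train, test = mapped[:n_train], mapped[n_train:]
    centroids = {y: mean_vector([v for v, l in train if l == y]) for y in (0, 1)}

    def predict(v):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(v, centroids[y])))

    return sum(predict(v) == y for v, y in test) / len(test)

data = make_toy_data(n_per_class=100, dim=10)
full_accuracy = evaluate(data, lambda v: v, n_train=150)
one_bit_accuracy = evaluate(data, sign_quantize, n_train=150)
print(f"full precision: {full_accuracy:.2f}, 1 bit: {one_bit_accuracy:.2f}")
```

On this toy data the class signal is spread over many dimensions, so the signs alone retain most of it. Whether real embeddings degrade as gently depends on how their information is distributed across dimensions.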
### Conclusion
The paper emphasizes the importance of word embeddings in natural language processing tasks and points out that there are significant differences in the quality and characteristics of different embedding models. Future research should further explore the factors affecting embedding quality, such as the size of the training corpus and the choice of the objective function.
Taken together, these experiments give a framework for understanding and evaluating word embeddings, which can inform further work in natural language processing.