Abstract: Word embedding models such as GloVe are widely used in natural language processing (NLP) research to convert words into vectors. Here, we provide a preliminary guide to probing latent emotions in text through GloVe word vectors. First, we trained a neural network model to predict continuous emotion valence ratings from linguistic inputs in the Stanford Emotional Narratives Dataset (SEND). After interpreting the weights in the model, we found that only a few dimensions of the word vectors contributed to expressing emotions in text, and that words were clustered on the basis of their emotional polarities. Furthermore, we performed a linear transformation that projected the high-dimensional embedded vectors into an emotion space. Based on the NRC Emotion Lexicon (EmoLex), we visualized the entanglement of emotions in the lexicon by using both projected and raw GloVe word vectors. We showed that, in the proposed emotion space, we were able to disentangle emotions better than with raw GloVe vectors alone. In addition, we found that the sum vectors of different pairs of emotion words successfully captured expressed human feelings in the EmoLex. For example, the sum of the two embedded word vectors expressing Joy and Trust, which together express Love, shared high similarity (similarity score .62) with the embedded vector expressing Optimism. In contrast, this sum vector was dissimilar (similarity score -.19) to the embedded vector expressing Remorse. In this paper, we argue that, through the proposed emotion space, the arithmetic of emotions is preserved in the word vectors. The affective representation uncovered in the emotion vector space could shed some light on how to help machines disentangle the emotions expressed in word embeddings.
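
The emotion arithmetic described in the abstract can be illustrated with a minimal sketch. It assumes that the reported similarity score is cosine similarity, which the abstract does not state explicitly. The three-dimensional vectors and the names `toy_vectors`, `add_vectors`, and `cosine_similarity` are hypothetical and chosen for illustration only; they are not GloVe vectors, projected emotion-space vectors, or values from the paper.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def add_vectors(u, v):
    """Return the element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


# Hypothetical toy vectors (assumption): made-up numbers standing in for
# word vectors after projection into an emotion space.
toy_vectors = {
    "joy": [0.9, 0.1, 0.0],
    "trust": [0.1, 0.9, 0.0],
    "optimism": [0.7, 0.6, 0.1],
    "remorse": [-0.6, -0.5, 0.4],
}

# Sum vector for the pair (Joy, Trust), compared against two other emotions.
joy_plus_trust = add_vectors(toy_vectors["joy"], toy_vectors["trust"])
sim_optimism = cosine_similarity(joy_plus_trust, toy_vectors["optimism"])
sim_remorse = cosine_similarity(joy_plus_trust, toy_vectors["remorse"])

print(f"sim(joy + trust, optimism) = {sim_optimism:.2f}")
print(f"sim(joy + trust, remorse)  = {sim_remorse:.2f}")
```

With real embeddings, the same two helper functions would be applied to the vectors looked up for each emotion word; only the toy dictionary would change.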

Implicit Subjective and Sentimental Usages in Multi-sense Word Embeddings.

Do Multi-Sense Embeddings Improve Natural Language Understanding?

Constructing High Quality Sense-specific Corpus and Word Embedding Via Unsupervised Elimination of Pseudo Multi-sense.

Understanding and Improving Multi-Sense Word Embeddings via Extended Robust Principal Component Analysis

Real Multi-Sense or Pseudo Multi-Sense: an Approach to Improve Word Representation

Beyond Bilingual: Multi-sense Word Embeddings using Multilingual Context

On Modeling Sense Relatedness in Multi-prototype Word Embedding.

Context-aware Sentiment Word Identification: Sentiword2vec.

Leveraging Human Prior Knowledge to Learn Sense Representations

Contextualized Word Embeddings Encode Aspects of Human-Like Word Sense Knowledge

Disentangling Latent Emotions of Word Embeddings on Complex Emotional Narratives

Learning Sense-specific Word Embeddings By Exploiting Bilingual Resources.

Refined Global Word Embeddings Based on Sentiment Concept for Sentiment Analysis

Disambiguating Sentiment Ambiguous Adjectives

Chinese Word Sense Embedding with SememeWSD and Synonym Set

Analysis of literal and metaphorical senses based on diachronic word embeddings

Learning Word Sense Embeddings from Word Sense Definitions

Multi-sense Definition Modeling using Word Sense Decompositions

Implicit Sentiment Analysis of Chinese Texts based on Contextual Information and Knowledge Enhancement

Addressing the Polysemy Problem in Language Modeling with Attentional Multi-Sense Embeddings

Learning Bilingual Embedding Model for Cross-Language Sentiment Classification