Abstract:It is important for machines to interpret human emotions properly for better human-machine communications, as emotion is an essential part of human-to-human communications. One aspect of emotion is reflected in the language we use. How to represent emotions in texts is a challenge in natural language processing (NLP). Although continuous vector representations like word2vec have become the new norm for NLP problems, their limitations are that they do not take emotions into consideration and can unintentionally contain bias toward certain identities like different genders. This thesis focuses on improving existing representations in both word and sentence levels by explicitly taking emotions inside text and model bias into account in their training process. Our improved representations can help to build more robust machine learning models for affect-related text classification like sentiment/emotion analysis and abusive language detection. We first propose representations called emotional word vectors (EVEC), which is learned from a convolutional neural network model with an emotion-labeled corpus, which is constructed using hashtags. Secondly, we extend to learning sentence-level representations with a huge corpus of texts with the pseudo task of recognizing emojis. Our results show that, with the representations trained from millions of tweets with weakly supervised labels such as hashtags and emojis, we can solve sentiment/emotion analysis tasks more effectively. Lastly, as examples of model bias in representations of existing approaches, we explore a specific problem of automatic detection of abusive language. We address the issue of gender bias in various neural network models by conducting experiments to measure and reduce those biases in the representations in order to build more robust classification models.

Learning Sentence Representation for Emotion Classification on Microblogs.

Emotion Detection in Online Social Networks: A Multilabel Learning Approach

Exploring Spatio-Temporal Representations by Integrating Attention-based Bidirectional-LSTM-RNNs and FCNs for Speech Emotion Recognition

Emotion recognition of social media users based on deep learning

An Empirical Study of Emotion Analysis with Different Distributed Representation Methods for Chinese Microblogs

Deep learning based emotion analysis of microblog texts

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

Leveraging distant supervision and deep learning for twitter sentiment and emotion classification

Emotion Classification in Microblog Texts Using Class Sequential Rules

Semisupervised Learning of Author‐specific Emotions in Micro‐blogs

Raw Tweets Emoji Tweets Word 2 Vec Sentence Representation Model Word Embeddings Representation Learning : Source Language Representation Learning : Target Language Labeled English Docs Supervised Learning Emoji Tweets Word 2 Vec Word EmbeddingsMachine Translate Classification Model

Social Media Opinion Summarization Using Emotion Cognition and Convolutional Neural Networks

Construction Of Microblog-Specific Chinese Sentiment Lexicon Based On Representation Learning

Fine-grained emotion classification of Chinese microblogs based on graph convolution networks

Text-based emotion classification using emotion cause extraction

Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification.

Attention-Based BiLSTM Network with Lexical Feature for Emotion Classification

Finding Good Representations of Emotions for Text Classification

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification

Deep Convolution Neural Networks for Twitter Sentiment Analysis.

An iterative emotion classification approach for microblogs