Sarcasm detection in social media based on imbalanced classification

Liu Peng, Chen Wei, Ou Gaoyan, Wang Tengjiao, Yang Dongqing, Lei Kai
DOI: https://doi.org/10.1007/978-3-319-08010-9_49
2014-01-01
Abstract: Sarcasm is a pervasive linguistic phenomenon in online documents that express subjective and deeply felt opinions. Detection of sarcasm is of great importance and beneficial to many NLP applications, such as sentiment analysis, opinion mining and advertising. Current studies consider automatic sarcasm detection as a simple text classification problem. They do not use explicit features to detect sarcasm and ignore the imbalance between sarcastic and non-sarcastic samples in real applications. In this paper, we first explore the characteristics of both English and Chinese sarcastic sentences and introduce a set of features specifically for detecting sarcasm in social media. Then, we propose a novel multi-strategy ensemble learning approach (MSELA) to handle the imbalance problem. We evaluate our proposed model on English and Chinese data sets. Experimental results show that our ensemble approach outperforms the state-of-the-art sarcasm detection approaches and popular imbalanced classification methods. © 2014 Springer International Publishing Switzerland.