Will Sentiment Analysis Need Subculture? A New Data Augmentation Approach

Zhenhua Wang, Simin He, Guang Xu, Ming Ren
2023-09-01
Abstract: The renowned proverb that "The pen is mightier than the sword" underscores the formidable influence wielded by text expressions in shaping sentiments. Indeed, well-crafted writing can deeply resonate within cultures, conveying profound sentiments. Nowadays, the omnipresence of the Internet has fostered a subculture that congregates around the contemporary milieu. The subculture artfully articulates the intricacies of human feelings by ardently pursuing the allure of novelty, a fact that cannot be disregarded in sentiment analysis. This paper strives to enrich data through the lens of subculture, to address the insufficient training data faced by sentiment analysis. To this end, a new approach of subculture-based data augmentation (SCDA) is proposed, which produces six enhanced texts for each training text by means of six diverse subculture expression generators. Extensive experiments attest to the effectiveness and potential of SCDA. The results also shed light on the phenomenon that disparate subculture expressions elicit varying degrees of sentiment stimulation. Moreover, an intriguing conjecture arises, suggesting the linear reversibility of certain subculture expressions. It is our fervent aspiration that this study serves as a catalyst in fostering heightened perceptiveness towards the tapestry of information, sentiment and culture, thereby enriching our collective understanding.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the shortage of training data in sentiment analysis. It proposes a subculture-based data augmentation method (SCDA) that enlarges the training set with subcultural expressions: six subculture expression generators each produce one enhanced text per training text, so every original example yields six augmented ones. The goal is to improve the performance of sentiment analysis models, especially on the diverse forms of expression found in modern social media. The paper argues that existing data augmentation strategies often overlook the distinctive expressions of subcultures, which can convey emotions more vividly, and that incorporating them helps a model capture the emotional variation of different groups. The paper also reports that different subcultural expressions stimulate sentiment to varying degrees, and it conjectures that certain subcultural expressions may be linearly reversible.
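The augmentation scheme summarized above (six generators, one enhanced text per generator per training text) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the six generators, so `make_placeholder_generator` and the `style-1` to `style-6` tags are hypothetical stand-ins, and keeping the original example while copying its label onto each augmented text is an assumption.

```python
from typing import Callable, List, Tuple

# A generator maps one training text to one rewritten text.
Generator = Callable[[str], str]
Example = Tuple[str, int]  # (text, sentiment label)


def make_placeholder_generator(tag: str) -> Generator:
    """Hypothetical stand-in for a subculture expression generator.

    The real generators are not described in the abstract; this one only
    prefixes a tag so the data flow can be followed.
    """
    def generate(text: str) -> str:
        return f"[{tag}] {text}"
    return generate


# Six placeholder generators, matching the count given in the abstract.
GENERATORS: List[Generator] = [
    make_placeholder_generator(f"style-{i}") for i in range(1, 7)
]


def augment(dataset: List[Example],
            generators: List[Generator] = GENERATORS) -> List[Example]:
    """Return the original examples plus one enhanced text per generator.

    Assumption: each enhanced text inherits the label of its source text.
    """
    augmented: List[Example] = []
    for text, label in dataset:
        augmented.append((text, label))            # keep the original
        for generate in generators:
            augmented.append((generate(text), label))
    return augmented


if __name__ == "__main__":
    for text, label in augment([("good movie", 1)]):
        print(label, text)
```

Under these assumptions, a training set of n examples grows to 7n: each original plus its six enhanced variants.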