Semi-Automatic Creation of Youth Slang Corpus and Its Application to Affective Computing

Fuji Ren, Kazuyuki Matsumoto
DOI: https://doi.org/10.1109/taffc.2015.2457915
IF: 13.99
2015-01-01
IEEE Transactions on Affective Computing
Abstract: This paper proposes a method for semi-automatically constructing a corpus of Japanese youth slang, known as Wakamono Kotoba. The construction process consists of three steps: automatic collection of example sentences, tag annotation of the collected sentences, and manual tag correction and noise reduction. This process raises two problems. The first is the quality of the automatically collected corpus. The second is the accuracy of tag annotation: if the automatically annotated tags are unreliable, correcting them manually takes a long time. To address the first problem, we propose a filtering method that automatically removes meaningless sentences (noise sentences). To address the second, we propose an emotion estimation method that can be applied to sentences that contain youth slang and are therefore difficult to analyze automatically. An accuracy evaluation showed an improvement in F1-score over the machine learning method, confirming the effectiveness of the proposed method.