User Gender Classification in Chinese Microblog

WANG Jingjing, LI Shoushan, HUANG Lei
DOI: https://doi.org/10.3969/j.issn.1003-0077.2014.06.021
2014-01-01
Abstract: This paper focuses on classifying users as male or female using the information available on Chinese microblogs. Although gender classification has received some research attention, little work has addressed it for Chinese. First, a classification method is proposed that uses either user names or the messages sent by users to recognize gender, and different types of features (e.g., character and word features) are investigated for this task. Second, building on the two classifiers trained on user names and on messages, Bayes rule is employed to combine them, so that the prediction draws on classification knowledge from both user names and messages. Experimental results demonstrate that the proposed approach performs well on gender classification, and that the combination method outperforms either individual classifier trained on only user names or only messages.
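The combination step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the version below is an assumption: it treats the user name and the messages as conditionally independent given the gender, in which case Bayes rule gives P(g | name, messages) proportional to P(g | name) * P(g | messages) / P(g). The function name `combine_posteriors`, the uniform prior, and the probability values are hypothetical and chosen only for illustration.

```python
from typing import Dict


def combine_posteriors(
    p_name: Dict[str, float],
    p_msg: Dict[str, float],
    prior: Dict[str, float],
) -> Dict[str, float]:
    """Fuse the posteriors of two classifiers with Bayes rule.

    Assumes (for illustration) that the two views, user name and
    messages, are conditionally independent given the class.
    """
    # Unnormalized score per class: P(g|name) * P(g|msg) / P(g)
    scores = {g: p_name[g] * p_msg[g] / prior[g] for g in prior}
    total = sum(scores.values())
    # Normalize so the fused posteriors sum to 1
    return {g: s / total for g, s in scores.items()}


# Hypothetical outputs of the two individual classifiers for one user
prior = {"male": 0.5, "female": 0.5}
p_name = {"male": 0.7, "female": 0.3}  # classifier trained on user names
p_msg = {"male": 0.4, "female": 0.6}   # classifier trained on messages

combined = combine_posteriors(p_name, p_msg, prior)
predicted = max(combined, key=combined.get)
```

In this made-up case the two classifiers disagree, and the fused posterior follows the more confident one (the user-name classifier). Any pair of probabilistic classifiers could supply `p_name` and `p_msg`.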