Discriminating gender on Chinese microblog: A study of online behaviour, writing style and preferred vocabulary

Li Li, Maosong Sun, Zhiyuan Liu
DOI: https://doi.org/10.1109/ICNC.2014.6975942
2014-01-01
Abstract: As user attributes are useful for applications such as personalized recommendation and advertising, user attribute prediction on Twitter has attracted intensive attention in recent years. Although Chinese micro-blogging services differ from Twitter in various aspects, such as language and user behaviour, few efforts have been made on Chinese micro-blogging services. In this paper, we propose a gender prediction model for Chinese microblogs that exploits features including online behaviour, writing style, and preferred vocabulary. Experimental results on Sina Weibo, one of the most popular micro-blogging services in China, show that our model achieves a state-of-the-art accuracy of 94.3%. We also find significant distinctions between male and female microblog users in online behaviour, writing style, and preferred vocabulary, which would be helpful for improving personalized applications.