A Multiple Feature Integration Model To Infer Occupation From Social Media Records

Xiang Wang, Lele Yu, Junjie Yao, Bin Cui
DOI: https://doi.org/10.1007/978-3-642-41154-0_10
2013-01-01
Abstract: With the rapid development of social media applications, large numbers of users are connected with friends, and their daily lives and opinions are recorded. Social media provides an unprecedented way to collect and analyze information about billions of users. Proper user attribute identification, or profile inference, is becoming increasingly attractive and feasible. However, the flourishing social records also pose great challenges for effective feature selection and integration in user profile inference, mainly because of text sparsity and complex community structures. In this paper, we propose a comprehensive framework to infer a user's occupation from his/her social activities recorded in micro-blog message streams. A multi-source integrated classification model is set up with carefully selected features. We first identify beneficial basic content features, and then tailor a community-discovery-based latent dimension solution to extract community features. Extensive empirical studies are conducted on a large real micro-blog dataset. We not only demonstrate that the integrated model shows advantages over several baseline methods, but also verify the effect of homophily in users' interaction records. The different effects of heterogeneous interactive networks are also revealed.
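The feature integration idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the bag-of-words content features, the neighbour-fraction community features, the 1-nearest-neighbour classifier, and all function names are assumptions made for illustration, and the communities are assumed to have been produced beforehand by some community discovery step.

```python
from collections import Counter


def content_features(posts, vocabulary):
    """Bag-of-words counts over a fixed vocabulary (stand-in for content features)."""
    counts = Counter(word for post in posts for word in post.lower().split())
    return [counts[word] for word in vocabulary]


def community_features(user, edges, communities):
    """Fraction of the user's interaction partners that fall in each community.

    `communities` is a list of sets of user ids, assumed to be precomputed.
    """
    neighbours = [b if a == user else a for a, b in edges if user in (a, b)]
    if not neighbours:
        return [0.0] * len(communities)
    return [sum(n in c for n in neighbours) / len(neighbours) for c in communities]


def integrate(posts, user, edges, communities, vocabulary):
    """Concatenate content and community features into one vector."""
    return content_features(posts, vocabulary) + community_features(
        user, edges, communities
    )


def predict(query, labelled):
    """1-nearest-neighbour prediction; `labelled` holds (vector, occupation) pairs."""

    def squared_distance(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    return min(labelled, key=lambda item: squared_distance(item[0], query))[1]


# Toy data, invented purely for illustration.
vocabulary = ["code", "patient"]
communities = [{"a", "b"}, {"c", "d"}]
edges = [("u", "a"), ("u", "b"), ("v", "c")]

vector = integrate(["Code review code"], "u", edges, communities, vocabulary)
labelled = [([3, 0, 1.0, 0.0], "engineer"), ([0, 2, 0.0, 1.0], "doctor")]
print(predict(vector, labelled))
```

Concatenation is the simplest way to combine heterogeneous feature sources. The community part of the vector is where homophily would show up, since users who interact mostly within one community receive similar values there.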