An Approach for Identifying Author Profiles of Blogs.

Chunxia Zhang, Yu Guo, Jiayu Wu, Shuliang Wang, Zhendong Niu, Wen Cheng
DOI: https://doi.org/10.1007/978-3-319-69179-4_33
2017-01-01
Abstract: Author profile identification is an important research problem in web mining, network public opinion monitoring, and social network analysis. Its aim is to identify characteristics or traits of the authors of textual content, such as blogs, microblogs, or reviews, on social network or commercial platforms. The technology can be applied in many areas, including cyberspace forensics, electronic commerce, and information security. In this paper, we propose a hybrid framework to solve the author profile identification problem. Within this framework, we design a distributed integrated representation of blogs based on Doc2vec and term frequency-inverse document frequency (TF-IDF), and apply a convolutional neural network to predict the age, gender, and education status of blog authors. The technique has three benefits: it predicts three different author traits in a uniform way, it learns representation vectors of blog posts from unlabeled data in an unsupervised manner, and it does not require any syntactic or semantic parsing of sentences. Experimental results on blogs show that our approach achieves promising performance.
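The abstract does not spell out how the Doc2vec and TF-IDF representations are combined, so the following is only a minimal sketch of one plausible reading: concatenating a dense document vector with a TF-IDF vector for each post. The function names, the toy posts, and the 8-dimensional dense vector are all hypothetical. The dense part is a deterministic hashed bag-of-words stand-in, not Doc2vec, and the convolutional classifier is omitted entirely.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def placeholder_dense_vector(doc, dim=8):
    """Hypothetical stand-in for a Doc2vec embedding (NOT Doc2vec itself).

    Each token is assigned to a bucket by the sum of its character codes,
    and the resulting count vector is L2-normalized.
    """
    vec = [0.0] * dim
    for tok in doc:
        vec[sum(map(ord, tok)) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def integrated_representation(docs, dim=8):
    """Concatenate the dense stand-in vector and the TF-IDF vector per post.

    Concatenation is an assumption; the paper may combine them differently.
    """
    _, tfidf = tfidf_vectors(docs)
    return [placeholder_dense_vector(d, dim) + t for d, t in zip(docs, tfidf)]


# Toy, made-up blog posts used purely for illustration.
posts = [
    "i love my new job at the lab".split(),
    "my homework is due and i hate school".split(),
    "the lab results are due tomorrow".split(),
]
features = integrated_representation(posts)
```

In a fuller pipeline, each row of `features` would be the input to a classifier such as the convolutional neural network the abstract mentions, with one output per trait (age, gender, education status).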