Leveraging Interactive Knowledge and Unlabeled Data in Gender Classification with Co-Training

Jingjing Wang, Yunxia Xue, Shoushan Li, Guodong Zhou
DOI: https://doi.org/10.1007/978-3-319-22324-7_23
2015-01-01
Abstract: Conventional approaches to gender classification rely heavily on large amounts of labeled data, which are normally hard and expensive to obtain. In this paper, we propose a co-training approach to address this problem. Specifically, we employ both non-interactive and interactive texts, i.e., the message and comment texts, as two different views in our co-training approach so that unlabeled data can be incorporated effectively. Experimental results on a large micro-blog data set demonstrate the appropriateness of leveraging interactive knowledge in gender classification and the effectiveness of the proposed co-training approach.
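The two-view idea in the abstract can be illustrated with a minimal co-training sketch. The abstract does not specify the base classifier, the confidence criterion, or how many examples are added per round, so everything below is an assumption made for illustration: a tiny multinomial Naive Bayes stands in for the classifier, the names `NaiveBayes`, `co_train`, and `predict` are hypothetical, and the token data is invented toy data rather than anything from the paper.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial Naive Bayes over token lists (add-one smoothing).
    Assumed stand-in classifier; the abstract does not name one."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
        self.vocab = {t for c in self.classes for t in self.counts[c]}
        return self

    def predict_proba(self, tokens):
        total = sum(self.prior.values())
        log_scores = {}
        for c in self.classes:
            denom = sum(self.counts[c].values()) + len(self.vocab)
            score = math.log(self.prior[c] / total)
            for t in tokens:
                if t in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((self.counts[c][t] + 1) / denom)
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled:   list of ((message_tokens, comment_tokens), label)
    unlabeled: list of (message_tokens, comment_tokens)
    Returns one classifier per view: [message_clf, comment_clf]."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        labels = [y for _, y in labeled]
        clfs = [NaiveBayes().fit([x[v] for x, _ in labeled], labels)
                for v in (0, 1)]
        picked = {}
        for v, clf in enumerate(clfs):
            scored = []
            for i, x in enumerate(pool):
                proba = clf.predict_proba(x[v])
                label = max(proba, key=proba.get)
                scored.append((proba[label], i, label))
            scored.sort(reverse=True)
            # Each view contributes its most confident pseudo-labels.
            for _, i, label in scored[:per_round]:
                picked.setdefault(i, label)
        labeled.extend((pool[i], label) for i, label in picked.items())
        pool = [x for i, x in enumerate(pool) if i not in picked]
    labels = [y for _, y in labeled]
    return [NaiveBayes().fit([x[v] for x, _ in labeled], labels)
            for v in (0, 1)]


def predict(clfs, sample):
    """Combine the two views by multiplying their posteriors."""
    combined = {}
    for v, clf in enumerate(clfs):
        for c, p in clf.predict_proba(sample[v]).items():
            combined[c] = combined.get(c, 1.0) * p
    return max(combined, key=combined.get)


# Invented toy data: abstract tokens, view 0 = message, view 1 = comments.
labeled = [((["w1", "w2"], ["k1"]), "female"),
           ((["w3", "w4"], ["k2"]), "male")]
unlabeled = [(["w1", "w5"], ["k1", "k3"]),
             (["w3", "w6"], ["k2", "k4"])]
clfs = co_train(labeled, unlabeled)
print(predict(clfs, (["w5"], ["k3"])))
```

In this toy run the tokens `w5` and `k3` never appear in the labeled data, so a purely supervised model has no evidence about them; they become informative only after the unlabeled examples are pseudo-labeled and folded into the training set, which is the effect co-training aims for.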