News Title Classification with Support from Auxiliary Long Texts.

Yuanxin Ouyang, Yao Huangfu, Hao Sheng, Zhang Xiong
DOI: https://doi.org/10.1007/978-3-319-12640-1_70
2014-01-01
Abstract: The performance of short text classification is limited by the intrinsic shortness of the sentences, which makes their vector space model representation sparse. Traditional classifiers such as SVM are extremely sensitive to the feature space, so classification performance is unsatisfactory in applications involving short text. It is believed that using external information to better represent the input data could yield better results. In this paper, we target the problem of news title classification, a typical and essential member of the short text family, and propose an approach that employs external information from long texts to address the sparseness problem. Restricted Boltzmann Machines are then used to select features, and classification is finally performed with a Support Vector Machine. The experimental study on Reuters-21578 and the Sogou Chinese news corpus demonstrates the effectiveness of the proposed method.
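The sparseness problem and the idea of borrowing terms from long texts can be illustrated with a minimal sketch. The abstract does not specify how the auxiliary long texts are used, so everything below is an assumption made for illustration: the helper names (`bag_of_words`, `cosine`, `enrich`), the toy texts, and the enrichment rule (copy the most frequent terms of the most similar long text) are hypothetical and are not the authors' method. The Restricted Boltzmann Machine feature selection and the SVM classification steps are omitted.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased, whitespace-tokenised term counts (a sparse vector)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def enrich(title, long_texts, top_terms=5):
    """Hypothetical enrichment rule: add the most frequent terms of the
    long text closest to the title to the title's term vector."""
    title_vec = bag_of_words(title)
    long_vecs = [bag_of_words(t) for t in long_texts]
    best = max(long_vecs, key=lambda v: cosine(title_vec, v))
    enriched = Counter(title_vec)
    for term, _ in best.most_common(top_terms):
        enriched[term] += 1
    return enriched


# Toy data, invented for this example only.
long_texts = [
    "oil prices rose as crude oil supply fell and oil exporters cut output",
    "the team won the match after the striker scored twice in the match",
]
title = "crude oil prices rise"

title_vec = bag_of_words(title)
enriched_vec = enrich(title, long_texts)

# The title alone has very few non-zero dimensions; the enriched vector has more.
print(len(title_vec), len(enriched_vec))
```

In a full pipeline, enriched vectors of this kind would be the input to the feature selection and classification stages described in the abstract.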