Feature Importance Analysis for Spammer Detection in Sina Weibo

ZHANG Yu-xiang, SUN Yu, YANG Jia-hai, ZHOU Da-lei, MENG Xiang-fei, XIAO Chun-jing
DOI: https://doi.org/10.11959/j.issn.1000-436x.2016152
2016-01-01
Abstract: Microblogs have drawn the attention of not only legitimate users but also spammers. The spam content posted by spammers significantly degrades the user experience. To improve spammer detection accuracy, most existing studies focus on either generating more classification features or proposing new classifiers. Which of these two directions deserves priority for research effort? Do extensive features or novel classifiers contribute more to detection accuracy? This paper addresses these questions by combining different feature selection methods with different classifiers on a real Sina Weibo dataset. Experimental results show that selected features matter more than novel classifiers for spammer detection. In addition, features should be drawn from a wide range of sources, such as text content, user behavior, and social relationships, and the feature dimension should not be too high. These results will be useful for identifying where future microblog anti-spam work can make a breakthrough.
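
The experimental design described in the abstract, crossing feature selection methods with classifiers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy data, the three hypothetical binary features, the two scoring functions (chi-square and mutual information), and the two classifiers (Bernoulli naive Bayes and nearest centroid). The abstract does not name the methods the authors actually used.

```python
import math
from itertools import product

# Toy, deterministic data (an assumption, not the paper's dataset):
# three hypothetical binary features per account. f0 agrees with the
# spammer label 80% of the time; f1 and f2 are independent of it.
def make_data(n=200):
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        f0 = 1 - y if i % 10 in (0, 1) else y
        rows.append([f0, (i // 2) % 2, (i // 4) % 2])
        labels.append(y)
    return rows, labels

def counts(col, labels):
    table = [[0, 0], [0, 0]]  # table[feature value][label]
    for x, y in zip(col, labels):
        table[x][y] += 1
    return table

def chi_square(col, labels):
    t, n = counts(col, labels), len(labels)
    score = 0.0
    for x, y in product((0, 1), repeat=2):
        expected = sum(t[x]) * (t[0][y] + t[1][y]) / n
        if expected > 0:
            score += (t[x][y] - expected) ** 2 / expected
    return score

def mutual_information(col, labels):
    t, n = counts(col, labels), len(labels)
    score = 0.0
    for x, y in product((0, 1), repeat=2):
        if t[x][y] > 0:
            px, py = sum(t[x]) / n, (t[0][y] + t[1][y]) / n
            score += t[x][y] / n * math.log(t[x][y] / n / (px * py))
    return score

def top_k(rows, labels, scorer, k):
    # Rank feature indices by score (highest first) and keep the first k.
    scores = [scorer([r[j] for r in rows], labels) for j in range(len(rows[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def naive_bayes(train_x, train_y):
    # Bernoulli naive Bayes with Laplace smoothing.
    model = {}
    for c in (0, 1):
        xs = [x for x, y in zip(train_x, train_y) if y == c]
        prior = math.log(len(xs) / len(train_x))
        probs = [(sum(col) + 1) / (len(xs) + 2) for col in zip(*xs)]
        model[c] = (prior, probs)
    def predict(x):
        def log_post(c):
            prior, probs = model[c]
            return prior + sum(math.log(p if v else 1 - p) for v, p in zip(x, probs))
        return max((0, 1), key=log_post)
    return predict

def nearest_centroid(train_x, train_y):
    cents = {}
    for c in (0, 1):
        xs = [x for x, y in zip(train_x, train_y) if y == c]
        cents[c] = [sum(col) / len(xs) for col in zip(*xs)]
    def predict(x):
        return min((0, 1), key=lambda c: sum((v - m) ** 2 for v, m in zip(x, cents[c])))
    return predict

def evaluate(k=1, split=140):
    # Cross every selector with every classifier; select features on the
    # training rows only, then measure accuracy on the held-out rows.
    rows, labels = make_data()
    selectors = {"chi2": chi_square, "mutual_info": mutual_information}
    classifiers = {"naive_bayes": naive_bayes, "nearest_centroid": nearest_centroid}
    results = {}
    for (s_name, scorer), (c_name, fit) in product(selectors.items(), classifiers.items()):
        keep = top_k(rows[:split], labels[:split], scorer, k)
        project = lambda r: [r[j] for j in keep]
        predict = fit([project(r) for r in rows[:split]], labels[:split])
        hits = sum(predict(project(r)) == y for r, y in zip(rows[split:], labels[split:]))
        results[(s_name, c_name)] = hits / (len(rows) - split)
    return results

if __name__ == "__main__":
    for combo, acc in evaluate(k=1).items():
        print(combo, round(acc, 3))
```

Comparing the accuracy spread across selectors (for a fixed classifier) with the spread across classifiers (for a fixed selector) is one simple way to ask which factor matters more. The toy data here is far too small and clean to support any conclusion of its own.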