Using Linguistic Features to Estimate Suicide Probability of Chinese Microblog Users

Lei Zhang, Xiaolei Huang, Tianli Liu, Zhenxiang Chen, Tingshao Zhu
DOI: https://doi.org/10.48550/arXiv.1411.0861
2014-11-04
Abstract: If people at high risk of suicide can be identified through social media such as microblogs, it becomes possible to implement an active intervention system to save their lives. With this motivation, the current study administered the Suicide Probability Scale (SPS) to 1041 users of Sina Weibo, a leading microblog service provider in China. Two NLP (Natural Language Processing) methods, the Chinese edition of the Linguistic Inquiry and Word Count (LIWC) lexicon and Latent Dirichlet Allocation (LDA), are used to extract linguistic features from the Sina Weibo data. We trained prediction models with machine learning algorithms on these two types of features to estimate suicide probability. The experimental results indicate that LDA can find topics that relate to suicide probability and that these topics improve prediction performance. Our study contributes to predicting the suicide probability of social network users from their online behavior.
Social and Information Networks, Computation and Language
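
The lexicon-based feature extraction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the category names and word lists below are hypothetical stand-ins for the Chinese LIWC lexicon, the function name `category_frequencies` is invented for illustration, and whitespace tokenization of English text is assumed in place of the Chinese word segmentation that real Weibo posts would need. The sketch computes, for each category, the fraction of a user's tokens that fall in that category, which is the general shape of a LIWC-style feature vector.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only.
# This is NOT the Chinese LIWC dictionary used in the paper.
TOY_LEXICON = {
    "negemo": {"sad", "hopeless", "pain"},
    "death": {"die", "death", "suicide"},
    "social": {"friend", "family", "talk"},
}


def category_frequencies(tokens, lexicon=TOY_LEXICON):
    """Return the fraction of tokens that belong to each lexicon category."""
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        for category, words in lexicon.items():
            if tok in words:
                counts[category] += 1
    # An empty token list yields all-zero features instead of dividing by zero.
    return {
        category: (counts[category] / total if total else 0.0)
        for category in lexicon
    }


# Assumed input: all posts of one user, already lowercased and
# split on whitespace (real Chinese text needs a word segmenter).
posts = ["i feel sad and hopeless", "talk to a friend about pain"]
tokens = " ".join(posts).lower().split()
features = category_frequencies(tokens)
```

A vector like `features` could then be combined with per-user LDA topic proportions and passed to a regression or classification model, which is the pipeline the abstract outlines at a high level.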