Feature Based Depression Detection from Twitter Data Using Machine Learning Techniques

Piyush Kumar, Poulomi Samanta, Suchandra Dutta, Moumita Chatterjee, Dhrubasish Sarkar
DOI: https://doi.org/10.37398/jsr.2022.660229
2022-01-01
Journal of Scientific Research
Abstract: Statistics presented by the World Health Organization identify depression as a primary cause of concern globally, leading to suicide in the majority of cases if left undetected. Studies show that depression generally affects writing style and language use. The primary aim of the proposed research is to study users’ posts on Twitter and identify the attributes that may indicate depressive symptoms in online users. The paper employs machine learning approaches and natural language processing techniques to train on the data and to evaluate the efficiency of the proposed method. The work proposes a numerical score for each user based on the sentiment value of their tweets and demonstrates that this feature alone can detect depression with an accuracy of 78% using the XGBoost classifier. This attribute is then combined with other linguistic features (N-gram + TF-IDF) and LDA to achieve an accuracy of 89% using the Support Vector Machine classifier. According to the proposed research, proper feature selection and feature combination improve performance.
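
The abstract does not give the formula for the per-user score, so the following is only a minimal sketch of the idea, assuming the score is the mean polarity of a user's tweets. The lexicon `TOY_LEXICON` and the functions `tweet_polarity` and `user_sentiment_score` are hypothetical names introduced for illustration; they are not the authors' method or data.

```python
from statistics import mean
from typing import List

# Toy polarity lexicon (an assumption for illustration only; a real system
# would use a full sentiment analysis tool rather than seven words).
TOY_LEXICON = {
    "happy": 1.0, "great": 1.0, "love": 1.0,
    "sad": -1.0, "hopeless": -1.0, "tired": -0.5, "alone": -0.5,
}


def tweet_polarity(tweet: str) -> float:
    """Mean lexicon polarity of the matching words in one tweet (0.0 if none)."""
    words = [w.strip(".,!?").lower() for w in tweet.split()]
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return mean(hits) if hits else 0.0


def user_sentiment_score(tweets: List[str]) -> float:
    """Hypothetical per-user score: mean polarity over all of the user's tweets."""
    if not tweets:
        return 0.0
    return mean(tweet_polarity(t) for t in tweets)


if __name__ == "__main__":
    tweets = ["I feel sad and hopeless", "so tired today"]
    print(user_sentiment_score(tweets))  # -0.75
```

In a pipeline like the one the abstract describes, a score of this kind would be one column of the feature matrix, placed alongside the N-gram TF-IDF weights and LDA topic proportions before a classifier is trained.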